This report covers the period May 24, 2001 through January 31, 2003 and documents
work performed by SRI International for the DARPA NEST program through AFRL-WPAFB
Contract F33615-01-C-1908.

Statically scheduled systems such as the time-triggered architecture (TTA) have
advantages over dynamically scheduled systems: they can achieve higher resource
utilization, they can more easily tolerate certain failure modes (e.g., babbling), and it
is easier to provide assurance arguments for their safety and dependability. On the other
hand, statically scheduled systems are less flexible and less able to adapt to changing
mission requirements or to a significant change in available resources. Traditionally,
such adaptation had to be pre-planned and implemented as a mode change.

In this project, our original goal was to develop technology for reconfiguring
time-triggered architectures during operation. This requires the ability to adapt existing
schedules or to calculate new ones online: computations that used to require an overnight
run must be reduced to seconds. Our initial work focussed on TTA and on fast scheduling.
The first two papers below describe these aspects.

However, because there was no interest in time-triggered solutions among other program
participants, nor opportunity to integrate this approach with the Open Experimental
Platforms, we focussed our later work on the run-time synthesis aspects. This led to the
breakthrough that we call “lazy theorem proving,” which combines the fast global search
of modern SAT solvers with an efficient decision procedure for a combination of important
theories including linear arithmetic. At a stroke, this allows all applications of SAT
solving (e.g., planning, diagnosis, bounded model checking) to be extended from models
described on purely Boolean structures to those over the richer domains covered by the
decision procedures.

The outputs of this research are documented in a series of technical papers that are
collected in Part II of this report. Below, we provide an index and abstracts for these papers.
All of them were selected for presentation at major scientific conferences, and we also
provide citations for these publications.

Bus Architectures for Safety-Critical Embedded Systems by John Rushby. Published
as [1].

Our  initial  work  focussed  on  the  Time  Triggered  Architecture  (TTA).  This  paper 
presents  a  comparison  of  TTA  with  other  architectures  for  safety-critical  embedded 
systems. 


Abstract Embedded systems for safety-critical applications often integrate multiple
“functions” and must generally be fault-tolerant. These requirements lead to
a need for mechanisms and services that provide protection against fault propagation
and ease the construction of distributed fault-tolerant applications. A number of
bus architectures have been developed to satisfy this need. This paper reviews the
requirements on these architectures, the mechanisms employed, and the services
provided. Four representative architectures (SAFEbus™, SPIDER, TTA, and FlexRay)
are briefly described.

On  the  Composition  of  Real-Time  Schedulers  by  Weirong  Wang  and  Aloysius  K.  Mok. 
Published  as  [2]. 

The disadvantage of architectures such as TTA is that they depend on pre-computed
schedules. These are expensive to compute, and inflexible. This paper develops methods
for constructing schedules for composite systems from those of their components
and provides first steps toward more flexible static schedules.


Abstract A complex real-time embedded system may consist of multiple application
components each of which has its own timeliness requirements and is scheduled
by component-specific schedulers. At run-time, the schedules of the components are
integrated to produce a system-level schedule of jobs to be executed. We formalize
the notions of schedule composition, task group composition and component
composition. Two algorithms for performing composition are proposed. The first one is
an extended Earliest Deadline First algorithm which can be used as a composability
test for schedules. The second algorithm, the Harmonic Component Composition
algorithm (HCC), provides an online admission test for components. HCC applies a
rate monotonic classification of workloads and is a hard real-time solution because
responsive supply of a shared resource is guaranteed for in-budget workloads. HCC
is also efficient in terms of composability and requires low computation cost for both
admission control and dispatch of resources.

Lazy Theorem Proving for Bounded Model Checking over Infinite Domains by
Leonardo de Moura, Harald Rueß, and Maria Sorea. Published as [3].

Our work on the run-time synthesis aspect of NEST focussed on integration of decision
procedures with SAT solving. This is the seminal paper that first described
the integration based on “lazy theorem proving.” Facts discovered by the decision
procedures are communicated to the SAT solver as additional lemmas that prune its
search space. The method is “lazy” in that these lemmas are generated only as they
are needed.


Abstract We investigate the combination of propositional SAT checkers with
domain-specific theorem provers as a foundation for bounded model checking over
infinite domains. Given a program M over an infinite state type, a linear temporal
logic formula φ with domain-specific constraints over program states, and an upper
bound k, our procedure determines if there is a falsifying path of length k to the
hypothesis that M satisfies the specification φ. This problem can be reduced to the
satisfiability of Boolean constraint formulas. Our verification engine for these kinds
of formulas is lazy in that propositional abstractions of Boolean constraint formulas
are incrementally refined by generating lemmas on demand from an automated analysis
of spurious counterexamples using theorem proving. We exemplify bounded
model checking for timed automata and for RTL level descriptions, and investigate
the lazy integration of SAT solving and theorem proving.

Lemmas on Demand for Satisfiability Solvers by Leonardo de Moura and Harald Rueß.
Published  as  [4]. 

Another way to look at “lazy theorem proving” is as a method for generating “lemmas
on demand.” This paper describes the method, and the interaction between the
decision procedures and the SAT solver, in more detail.

Abstract We investigate the combination of propositional SAT checkers with constraint
solvers for domain-specific theories such as linear arithmetic, arrays, lists and
the combination thereof. Our procedure realizes a lazy approach to satisfiability
checking of propositional constraint formulas by iteratively refining Boolean formulas
based on lemmas generated on demand by constraint solvers.
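
The refinement loop described in the two abstracts above can be illustrated with a toy
sketch. This is not ICS and not the authors' implementation: brute-force enumeration
stands in for a SAT solver, the "theory" is only upper and lower bounds on a single
integer variable x, and every name in the code (`bounds`, `theory_conflict`,
`sat_search`, `lazy_solve`) is hypothetical.

```python
from itertools import product
from typing import Dict, List, Optional, Tuple

Atom = Tuple[str, int]        # ("<=", c) means x <= c; (">=", c) means x >= c
Literal = Tuple[int, bool]    # (index of atom, required truth value)
Clause = List[Literal]        # disjunction of literals


def bounds(atom: Atom, value: bool) -> Tuple[float, float]:
    """Interval of integers x for which the atom has the given truth value."""
    op, c = atom
    if op == "<=":
        return (float("-inf"), c) if value else (c + 1, float("inf"))
    if op == ">=":
        return (c, float("inf")) if value else (float("-inf"), c - 1)
    raise ValueError(f"unknown operator {op!r}")


def theory_conflict(atoms: List[Atom], model: Dict[int, bool]) -> Optional[Clause]:
    """Return None if the assignment is consistent, otherwise a lemma (clause)."""
    lo, hi = float("-inf"), float("inf")
    lo_lit = hi_lit = None
    for i, value in model.items():
        l, h = bounds(atoms[i], value)
        if l > lo:
            lo, lo_lit = l, (i, value)
        if h < hi:
            hi, hi_lit = h, (i, value)
    if lo <= hi:
        return None
    # The tightest lower and upper bounds clash, so the two literals that
    # produced them cannot hold together: the lemma negates both.
    return [(lo_lit[0], not lo_lit[1]), (hi_lit[0], not hi_lit[1])]


def sat_search(n_atoms: int, clauses: List[Clause]) -> Optional[Dict[int, bool]]:
    """Stand-in for a SAT solver: first assignment satisfying every clause."""
    for values in product([False, True], repeat=n_atoms):
        if all(any(values[i] == v for i, v in clause) for clause in clauses):
            return dict(enumerate(values))
    return None


def lazy_solve(atoms: List[Atom], clauses: List[Clause]):
    """Refine the Boolean abstraction with lemmas until a verdict is reached."""
    clauses = list(clauses)
    lemmas: List[Clause] = []
    while True:
        model = sat_search(len(atoms), clauses)
        if model is None:
            return None, lemmas            # unsatisfiable modulo the theory
        lemma = theory_conflict(atoms, model)
        if lemma is None:
            return model, lemmas           # consistent Boolean model found
        lemmas.append(lemma)               # generated only when needed
        clauses.append(lemma)
```

For the formula `x <= 3 and (x >= 0 or x >= 5)`, encoded as
`lazy_solve([("<=", 3), (">=", 0), (">=", 5)], [[(0, True)], [(1, True), (2, True)]])`,
the first Boolean candidate is inconsistent with the bounds, one lemma is learned, and the
second candidate is accepted. Each lemma excludes at least the current candidate, so the
loop terminates.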

Embedded Deduction With ICS by Leonardo de Moura, Harald Rueß, John Rushby, and
Natarajan Shankar. Published as [5].

An implementation of the method described in the previous two papers is made freely
available by SRI as the tool ICS (available from ics.csl.sri.com). This paper
describes the design decisions and capabilities of ICS.


Abstract Formal analyses can provide valuable assurance for high confidence software
and systems. The analyses can range from strong typechecking through test case
generation and static analysis to model checking and full verification. In all cases,
the tools that support the analyses use formal deduction in some way or other. ICS
is a fully automatic, high-performance decision procedure for a broad combination
of theories that can be embedded in all tools of this kind to provide them with a core
deductive capability of exceptional power and performance. We describe the design
choices underlying ICS and the capabilities it provides.

Bounded Model Checking and Induction: From Refutation to Verification by
Leonardo de Moura, Harald Rueß, and Maria Sorea. Published as [6].

One of the main applications of ICS is bounded model checking (BMC). Originally,
BMC was seen as a refutational (i.e., debugging) activity, but it was soon applied to
synthesis problems such as planning and test case generation. Then we discovered
that variations on BMC can be used to perform verification in a very effective and
automated manner. This paper describes the methods by which BMC with ICS is
extended to verification problems.

Abstract We explore the combination of bounded model checking and induction
for proving safety properties of infinite-state systems. In particular, we define a
general k-induction scheme and prove completeness thereof. A main characteristic of
our methodology is that strengthened invariants are generated from failed k-induction
proofs. This strengthening step requires quantifier-elimination, and we propose a lazy
quantifier elimination procedure, which delays expensive computations of disjunctive
normal forms when possible. The effectiveness of induction based on bounded
model checking and invariant strengthening is demonstrated using infinite-state
systems ranging from communication protocols to timed automata and (linear) hybrid
automata.
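
To make the idea of k-induction concrete, here is a minimal sketch on a small, explicitly
enumerated transition system. It is only an illustration under strong assumptions: the
paper addresses infinite-state systems using decision procedures, whereas this sketch
enumerates states, omits the invariant strengthening step, and does not restrict the step
case to loop-free paths. The function name `k_induction` and the example system are
hypothetical.

```python
from typing import Callable, Hashable, Iterable, Set

State = Hashable


def k_induction(
    states: Iterable[State],
    init: Iterable[State],
    succ: Callable[[State], Iterable[State]],
    prop: Callable[[State], bool],
    k: int,
) -> str:
    """Try to show that prop holds in every reachable state using k-induction."""
    if k < 1:
        raise ValueError("k must be at least 1")

    # Base case (plain bounded model checking): prop must hold in every
    # state reachable from an initial state in fewer than k transitions.
    frontier: Set[State] = set(init)
    for _ in range(k):
        if not all(prop(s) for s in frontier):
            return "violated"
        frontier = {t for s in frontier for t in succ(s)}

    # Step case: consider ANY path of k consecutive states that all satisfy
    # prop (reachable or not); `ends` collects the last states of such paths.
    ends: Set[State] = {s for s in states if prop(s)}
    for _ in range(k - 1):
        ends = {t for s in ends for t in succ(s) if prop(t)}

    # If every successor of such a path also satisfies prop, prop is invariant.
    if all(prop(t) for s in ends for t in succ(s)):
        return "proved"
    return "unknown"  # prop is not k-inductive; a larger k may succeed


# Example: states 0, 1, 2 form the reachable cycle; 3 -> 4 -> 5 is unreachable.
SUCC = {0: [1], 1: [2], 2: [0], 3: [4], 4: [5], 5: [5]}
```

With the property "state is not 5", `k_induction(range(6), [0], SUCC.__getitem__, lambda s: s != 5, k)`
is inconclusive for k = 1 and k = 2, because the unreachable states 3 and 4 lead to 5, and
succeeds for k = 3. Increasing k is the crude alternative to the invariant strengthening
that the abstract describes.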
Abstract. Embedded systems for safety-critical applications often integrate multiple
“functions” and must generally be fault-tolerant. These requirements lead to
a need for mechanisms and services that provide protection against fault propagation
and ease the construction of distributed fault-tolerant applications. A number
of bus architectures have been developed to satisfy this need. This paper reviews
the requirements on these architectures, the mechanisms employed, and the services
provided. Four representative architectures (SAFEbus™, SPIDER, TTA,
and FlexRay) are briefly described.


1  Introduction 

Embedded systems generally operate as closed-loop control systems: they repeatedly
sample sensors, calculate appropriate control responses, and send those responses to
actuators. In safety-critical applications, such as fly- and drive-by-wire (where there are
no direct connections between the pilot and the aircraft control surfaces, nor between
the driver and the car steering and brakes), requirements for ultra-high reliability
demand fault tolerance and extensive redundancy. The embedded system then becomes a
distributed one, and the basic control loop is complicated by mechanisms for
synchronization, voting, and redundancy management.

Systems used in safety-critical applications have traditionally been federated, meaning
that each “function” (e.g., autopilot or autothrottle in an aircraft, and brakes or
suspension in a car) has its own fault-tolerant embedded control system with only minor
interconnections to the systems of other functions. This provides a strong barrier to
fault propagation: because the systems supporting different functions do not share
resources, the failure of one function has little effect on the continued operation of others.
The federated approach is expensive, however (because each function has its own
replicated system), so recent applications are moving toward more integrated solutions in
which some resources are shared across different functions. The new danger here is
that faults may propagate from one function to another; partitioning is the problem of
restoring to integrated systems the strong defenses against fault propagation that are
naturally present in federated systems. A dual issue is that of strong composability:
here we would like to take separately developed functions and have them run without
interference on an integrated system platform with negligible integration effort.

* This research was supported by the DARPA MOBIES and NEST programs through USAF
Rome Laboratory contracts F33615-00-C-1700 and F33615-01-C-1908, and by NASA Langley
Research Center under contract NAS1-20334 and Cooperative Agreement NCC-1-377
with Honeywell Incorporated.

The problems of fault tolerance, partitioning, and strong composability are challenging
ones. If handled in an ad-hoc manner, their mechanisms can become the primary
sources of faults and of unreliability in the resulting architecture [10]. Fortunately,
most aspects of these problems are independent of the particular functions concerned,
and they can be handled in a principled and correct manner by generic mechanisms
implemented as an architecture for distributed embedded systems.

One of the essential services provided by this kind of architecture is communication
of information from one distributed component to another, so a (physical or logical)
communication bus is one of its principal components, and the protocols used for control
and communication on the bus are among its principal mechanisms. Consequently,
these architectures are often referred to as buses (or databuses), although this term
understates their complexity, sophistication, and criticality. In truth, these architectures are
the safety-critical core of the applications built above them, and the choice of services to
provide to those applications, and the mechanisms of their implementation, are issues
of major importance in the construction and certification of safety-critical embedded
systems.

In this paper, I survey some of the issues in the design of bus architectures for
safety-critical embedded systems. I hope this will prove useful to potential users of these
architectures and will alert others to the benefits of building on such well-considered
foundations. My presentation is derived from a review of four representative
architectures; two of these were primarily designed for aircraft applications and two for
automobiles. The economies of scale make the automobile buses quite inexpensive, which
then renders them attractive in certain aircraft applications. The aircraft buses
considered are the Honeywell SAFEbus [1,7] (the backplane data bus used in the Boeing 777
Airplane Information Management System) and the NASA SPIDER [11] (an architecture
being developed as a demonstrator for certification under the new DO-254
guidelines [15]); the automobile buses considered are the TTTech Time-Triggered
Architecture (TTA) [8,24], recently adopted by Audi and Volkswagen for automobile
applications, and by Honeywell for avionics and aircraft controls functions, and FlexRay [3],
which is being developed by a consortium of BMW, DaimlerChrysler, Motorola, and
Philips. A detailed comparison of these four architectures, along with more extended
discussion of the issues, is available as a technical report [17].

The paper is organized as follows: Section 2 examines general issues in time-triggered
systems and bus architectures. Section 3 examines the fault hypotheses under
which they operate, and Section 4 describes the services that they provide; Section 5
briefly describes the four representative architectures, and conclusions are provided in
Section 6.
Fig. 1. Bus Interconnect


2  Time-Triggered  Buses 

The architectures considered here are called “buses” because multicast or broadcast
communication is one of the services that they provide, and their implementations are
based on a logical or physical bus. In a generic bus architecture, application programs
run in host computers, and sensors and actuators are also connected to the hosts; an
interconnect medium provides broadcast communications, and interface devices connect
the hosts to the interconnect. The interfaces and interconnect comprise the bus;
the combination of a host and its interface(s) is referred to as a node. Realizations of
the interconnect may be a physical (passive) bus, as shown in Figure 1, or a centralized
(active) hub, as shown in Figure 2. The interfaces may be physically proximate to the
hosts, or they may form part of a more complex central hub. Many of the components
will be replicated for fault tolerance.

All four of the buses considered here are primarily time triggered; this is a
fundamental design choice that influences many aspects of their architectures and
mechanisms, and sets them apart from fundamentally event-triggered buses such as
Byteflight, CAN, Ethernet, LonWorks, or Profibus. The time-triggered and event-triggered
approaches to systems design find favor in different application areas, and each has
strong advocates; for integrated, safety-critical systems, however, the time-triggered
approach is generally preferred. “Time triggered” means that all activities involving the
bus, and often those involving components attached to the bus, are driven by the passage
of time (“if it is 20 ms since the start of the frame, then read the sensor and broadcast
its value”); this is distinguished from “event triggered,” which means that activities are
driven by the occurrence of events (“if the sensor reading changes, then broadcast its
new value”). A prime contrast between these two approaches is their locus of control:
a time-triggered system controls its own activity and interacts with the environment
according to an internal schedule, whereas an event-triggered system is under the control
of its environment and must respond to stimuli as they occur.
Fig. 2. Star Interconnect


Event-triggered systems allow flexible allocation of resources and this is attractive
when demands are highly variable. However, in safety-critical applications it is
necessary to guarantee some basic quality of service to all participants, even (or especially)
in the presence of faults. Because the clients of the bus architecture are real-time
embedded control systems, the required guarantees include predictable communications
with low latency and low jitter (assured bandwidth is not enough). The problem with
event-driven buses is that events arriving at different nodes may cause them to contend
for access to the bus, so some form of media access control (i.e., a distributed mutual
exclusion algorithm) is needed to ensure that each node eventually is able to transmit
without interruption. The important issue is how predictable is the access achieved by
each node, and how strong is the assurance that the predictions remain true in the
presence of faults.

Buses such as Ethernet resolve contention probabilistically and therefore can provide
only probabilistic guarantees of timely access, and no assurance at all in the presence
of faults. Buses for non-safety-critical embedded systems such as CAN, LonWorks,
or Profibus use various priority, preassigned slot, or token schemes to resolve
contention deterministically. In CAN, for example, the message with the lowest number
always wins the arbitration and therefore has to wait only for the current message
to finish, while other messages must also wait for any lower-numbered messages. Thus,
although contention is resolved deterministically, latency increases with load and can be
bounded with only probabilistic guarantees, and these can be quite weak in the presence
of faults (e.g., the current message may be retransmitted in the case of transmission
failure, thereby delaying the next message, even if this has higher priority).
Furthermore, faulty nodes may not adhere to expected patterns of use and may make excessive
demands for service, thereby reducing that available to others. Event-triggered buses
for safety-critical applications add various mechanisms to limit such demands. ARINC
629 (an avionics data bus used in the Boeing 777), for example, uses a technique
sometimes referred to as “minislotting” that requires each node to wait a certain period after
sending a message before it can contend to send another. Even here, however, latency is
a function of load, so the Byteflight automobile protocol developed by BMW extends
this mechanism with guaranteed, preallocated slots for critical messages. But even
preallocated slots provide no protection against a faulty node that fails to recognize them.
The worst manifestation of this kind of fault is the so-called “babbling idiot” failure
where a faulty node transmits constantly, thereby compromising the operation of the
entire bus.
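
The dependence of latency on load under priority arbitration can be seen in a small
simulation. This is a simplified sketch, not a model of the actual CAN protocol: it
assumes unique message numbers, a fixed frame duration, no transmission errors, and
arbitration only when the bus is idle. The function name `simulate_priority_bus` is
hypothetical.

```python
import heapq
from typing import Dict, List, Tuple


def simulate_priority_bus(
    arrivals: List[Tuple[int, int]], frame_time: int = 1
) -> Dict[int, int]:
    """Simulate non-preemptive, lowest-number-wins arbitration.

    arrivals is a list of (arrival_time, message_number) pairs.
    Returns {message_number: time at which its transmission starts}.
    """
    arrivals = sorted(arrivals)
    pending: List[int] = []          # min-heap: lowest number = highest priority
    start: Dict[int, int] = {}
    t, i = 0, 0
    while i < len(arrivals) or pending:
        if not pending and arrivals[i][0] > t:
            t = arrivals[i][0]       # bus stays idle until the next arrival
        while i < len(arrivals) and arrivals[i][0] <= t:
            heapq.heappush(pending, arrivals[i][1])
            i += 1
        winner = heapq.heappop(pending)   # arbitration among waiting messages
        start[winner] = t
        t += frame_time                   # the frame occupies the bus, uninterrupted
    return start
```

On an otherwise idle bus, message 5 arriving at time 0 is sent at once. If messages 1, 2,
and 3 arrive at times 0, 1, and 2, message 5 is delayed until time 3, so its latency grows
with the load of lower-numbered traffic. In contrast, with `frame_time=2`, message 1
arriving at time 1 while message 9 is on the bus waits only until time 2, when that
frame ends.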

In a time-triggered bus, there is a static preallocation of communication bandwidth
in the form of a global schedule: each node knows the schedule and knows the time,
and therefore knows when it is allowed to send messages, and when it should expect
to receive them. Thus, contention is resolved at design time (as the schedule is
constructed), when all its consequences can be examined, rather than at run time. A static
schedule makes possible the control of the babbling idiot failure mode. This is achieved
by interposing an independent component, called a bus guardian, that allows each node
to transmit on the bus only when it is allowed to do so. The guardian must know when
its node is allowed to access the bus, which is difficult to achieve in an event-triggered
system but is conceptually simple in a time-triggered system: the guardian has an
independent clock and independent knowledge of the schedule and allows its node to
broadcast only when indicated by the schedule.
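
A minimal sketch of this arrangement follows. It assumes a cyclic schedule with equal-length
slots and perfectly synchronized clocks, so one time value serves both the node and its
guardian; a real guardian has its own clock and must allow for skew. The names
`Schedule`, `Guardian`, and `collisions` are hypothetical and do not come from any of the
architectures discussed.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Schedule:
    """Static, cyclic schedule: owners[i] may transmit during slot i."""
    slot_len: int
    owners: Tuple[str, ...]

    def owner_at(self, t: int) -> str:
        return self.owners[(t // self.slot_len) % len(self.owners)]


class Guardian:
    """Gate between one node and the bus, holding its own copy of the schedule."""

    def __init__(self, node: str, schedule: Schedule) -> None:
        self.node = node
        self.schedule = schedule

    def permits(self, guardian_time: int) -> bool:
        return self.schedule.owner_at(guardian_time) == self.node


def collisions(
    schedule: Schedule, attempts: Dict[int, List[str]], guarded: bool
) -> List[int]:
    """Return the times at which more than one node reaches the bus."""
    bad = []
    for t, nodes in sorted(attempts.items()):
        on_bus = [
            n for n in nodes
            if not guarded or Guardian(n, schedule).permits(t)
        ]
        if len(on_bus) > 1:
            bad.append(t)
    return bad


# Node B babbles at every tick; A and C transmit only in their own slots.
schedule = Schedule(slot_len=2, owners=("A", "B", "C"))
attempts = {t: ["B"] for t in range(6)}
attempts[0] = ["B", "A"]
attempts[4] = ["B", "C"]
```

Without guardians the babbling node collides with A at time 0 and with C at time 4; with
guardians, B reaches the bus only during its own slot and the collisions disappear.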

Because  all  communication  is  triggered  by  the  global  schedule,  there  is  no  need 
to  attach  source  or  destination  addresses  to  messages  sent  over  a  time-triggered  bus: 
each  node  knows  the  sender  and  intended  recipients  of  each  message  by  virtue  of  the 
time  at  which  it  is  sent.  Elimination  of  the  address  fields  not  only  reduces  the  size  of 
each  message,  thereby  greatly  increasing  the  message  bandwidth  of  the  bus  (messages 
are  typically  short  in  embedded  control  applications),  but  it  also  eliminates  a  potential 
source  of  serious  faults:  namely,  the  possibility  that  a  faulty  node  may  send  messages 
to  the  wrong  recipients  or,  worse,  may  masquerade  as  a  sender  other  than  itself. 

Fault-tolerant clock synchronization is a fundamental requirement for a time-triggered
bus architecture: the abstraction of a global clock is realized by each node
having a local clock that is closely synchronized with the clocks of all other nodes.
Tightness of the bus schedule, and hence the throughput of the bus, is strongly related
to the quality of global clock synchronization that can be achieved, and this is related
to the quality of the clock oscillators local to each node, and to the algorithm used to
synchronize them. There are two basic classes of algorithm for clock synchronization:
those based on averaging and those based on events. Averaging works by each node
measuring the skew between its clock and that of each other node (e.g., by comparing
the arrival time of each message with its expected value) then setting its clock to some
“average” value. A simple average (e.g., the mean or median) over all clocks may be
affected by wild readings from faulty clocks (which might provide different, or missing,
readings to different observers), so we need a “fault-tolerant average” that is largely
insensitive to a certain number of readings from faulty clocks. Schneider [18] gives
a general description that applies to all averaging clock synchronization algorithms;
these algorithms differ only in their choice of fault-tolerant average. The Welch-Lynch
algorithm [25] is a popular choice that is characterized by use of the “fault-tolerant
midpoint” as its averaging function. Event-based algorithms rely on nodes being able
to sense events directly on the interconnect: each node broadcasts a “ready” event when
it is time to synchronize and sets its clock when it has seen a certain number of events
from other nodes. Depending on the fault model, additional waves of “echo” or “accept”
events may be needed to make this fault tolerant. The number of faulty nodes that
can be tolerated, and the quality of synchronization that can be achieved, depend on
the details of the algorithm, and on the fault hypothesis under which it operates. The
event-based algorithm of Srikanth and Toueg [21] is particularly attractive because it
achieves optimal accuracy.
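
The fault-tolerant midpoint can be sketched in a few lines. The version below follows the
usual description of that function (discard the f largest and f smallest readings, then
take the midpoint of the remaining extremes) and enforces the commonly stated requirement
of at least 3f + 1 readings; both points are assumptions of this sketch rather than
details given in the text, and the function name is hypothetical.

```python
from typing import Sequence


def fault_tolerant_midpoint(skews: Sequence[float], f: int) -> float:
    """Midpoint of the readings that remain after trimming f from each end.

    skews holds the measured differences between the other clocks and the
    local clock; f is the number of faulty clocks to be tolerated.
    """
    if f < 0:
        raise ValueError("f must be non-negative")
    if len(skews) < 3 * f + 1:
        raise ValueError("need at least 3f + 1 readings to tolerate f faults")
    ordered = sorted(skews)
    trimmed = ordered[f:len(ordered) - f]   # drop the f smallest and f largest
    return (trimmed[0] + trimmed[-1]) / 2
```

With readings `[-3, -1, 0, 2, 1000]` and `f = 1`, the wild value 1000 and the lowest value
-3 are discarded and the correction is 0.5, whereas the plain mean of the same readings
would be 199.6.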

3  Fault  Hypotheses  and  Fault  Containment  Units 

Safety-critical aerospace functions are generally required to have failure rates less than
10^-9 per hour [5], and an architecture that is intended to support several such
functions should provide assurance of failure rates better than 10^-10 per hour. Similar
requirements apply to cars (although higher rates of loss are accepted for individual cars
than aircraft, there are vastly more of them, so the required failure rates are similar).
Consumer-grade electronics devices have failure rates many orders of magnitude worse
than this, so redundancy and fault tolerance are essential elements of a bus architecture.
Redundancy may include replication of the entire bus, of the interconnect and/or the
interfaces, or decomposition of those elements into smaller subcomponents that are then
replicated.

Fault tolerance takes two forms in these architectures: first is that which ensures that
the bus itself does not fail, second is that which eases the construction of fault-tolerant
applications. Each of these mechanisms must be constructed and validated against an
explicit fault hypothesis, and must deliver specified services (that may be specified to
degrade in acceptable ways in the presence of faults). The fault hypothesis must
describe the modes (i.e., kinds) of faults that are to be tolerated, and their maximum
number and arrival rate. The fault hypothesis must also identify the different fault
containment units (FCUs) in the design: these are the components that can independently
be afflicted by faults. The division of an architecture into separate FCUs needs careful
justification: there must be no propagation of faults from one FCU to another, and
no “common mode failures” where a single physical event produces faults in multiple
FCUs. Only physical faults (those caused by damage to, defects in, or aging of the
devices employed, or by external disturbances such as cosmic rays and electromagnetic
interference) are considered in this analysis: design faults must be excluded, and must
be shown to be so by stringent assurance and certification processes.

The assumption that failures of separate FCUs are independent must be ensured by
careful design and assured by stringent analysis. True independence generally requires
that different FCUs are served by different power supplies, and are physically and
electrically isolated from each other. Providing this level of independence is expensive and
it is generally undertaken only in aircraft applications. In cars, it is common to make
some small compromises on independence: for example, the guardians may be
fabricated on the same chip as the interface (but with their own clock oscillators), or the
interface may be fabricated on the same chip as the host processor. It is necessary to
examine these compromises carefully to ensure that the loss in independence applies
only to fault modes that are benign, extremely rare, or tolerated by other mechanisms.

A  fault  mode  describes  the  kind  of  behavior  that  a  faulty  FCU  may  exhibit.  The  same 
fault  may  exhibit  different  modes  at  different  levels  of  a  protocol  hierarchy:  for  example, 
at  the  electrical  level,  the  fault  mode  of  a  faulty  line  driver  may  be  that  it  sends  an 
intermediate  voltage  (one  that  is  neither  a  digital  0  nor  a  digital  1),  while  at  the  message 
level  the  mode  of  the  same  fault  may  be  “Byzantine,”  meaning  that  different  receivers 
interpret  the  same  message  in  different  ways  (because  some  see  the  intermediate  voltage 
as  a  0,  and  others  as  a  1).  Some  protocols  can  tolerate  Byzantine  faults,  others  cannot; 
for  those  that  cannot,  we  must  show  that  the  fault  mode  is  controlled  at  the  underlying 
electrical  level. 
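To illustrate how an electrical-level fault can appear Byzantine at the message level, the following Python sketch models three receivers whose decision thresholds differ slightly. The receiver names, threshold values, and voltages are hypothetical, chosen only to make the effect visible; real line receivers are considerably more complicated.

```python
# Hypothetical decision thresholds (in volts) for three receivers on one line.
THRESHOLDS = {"A": 2.4, "B": 2.5, "C": 2.6}


def interpret(voltage, threshold):
    """A receiver reads a digital 1 if the line voltage reaches its threshold."""
    return 1 if voltage >= threshold else 0


def readings(voltage, thresholds):
    """Bit read by each receiver for the same line voltage."""
    return {name: interpret(voltage, t) for name, t in thresholds.items()}


def is_asymmetric(voltage, thresholds):
    """True if the receivers do not all read the same bit."""
    return len(set(readings(voltage, thresholds).values())) > 1


if __name__ == "__main__":
    for volts in (0.2, 2.5, 4.8):
        print(volts, readings(volts, THRESHOLDS), is_asymmetric(volts, THRESHOLDS))
```

Under these assumed thresholds, a clean 0 or 1 is read identically by every receiver, whereas the intermediate voltage is read as a 1 by some receivers and as a 0 by others.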

The  basic  dimensions  that  a  fault  can  affect  are  value,  time,  and  space.  A  value  fault 
is  one  that  causes  an  incorrect  value  to  be  computed,  transmitted,  or  received  (whether 
as  a  physical  voltage,  a  logical  message,  or  some  other  representation);  a  timing  fault 
is  one  that  causes  a  value  to  be  computed,  transmitted,  or  received  at  the  wrong  time 
(whether  too  early,  too  late,  or  not  at  all);  a  spatial  proximity  fault  is  one  where  all 
matter  in  some  specified  volume  is  destroyed  (potentially  afflicting  multiple  FCUs). 
Bus-based  interconnects  of  the  kind  shown  in  Figure  1  are  vulnerable  to  spatial  prox¬ 
imity  faults:  all  redundant  buses  necessarily  come  into  close  proximity  at  each  node,  and 
general  destruction  in  that  space  could  sever  or  disrupt  them  all.  Interconnect  topolo¬ 
gies  with  a  central  hub  are  far  more  resilient  in  this  regard:  a  spatial  proximity  fault 
that  destroys  one  or  more  nodes  does  not  disrupt  communication  among  the  others  (the 
hub  may  need  to  isolate  the  lines  to  the  destroyed  nodes  in  case  these  are  shorted),  and 
destruction  of  a  hub  can  be  tolerated  if  there  is  a  duplicate  in  another  location. 

There  are  many  ways  to  classify  the  effects  of  faults  in  any  of  the  basic  dimensions. 
One  classification  that  has  proved  particularly  effective  in  analysis  of  the  types  of  al¬ 
gorithms  that  underlie  the  architectures  considered  here  is  the  hybrid  fault  model  of 
Thambidurai and Park [23]. In this classification, the effect of a fault may be manifest,
meaning that it is reliably detected (e.g., a fault that causes an FCU to cease transmitting
messages), symmetric, meaning that whatever the effect, it is the same for all observers
(e.g.,  an  off-by-1  error),  or  arbitrary,  meaning  that  it  is  entirely  unconstrained.  In  par¬ 
ticular,  an  arbitrary  fault  may  be  asymmetric  or  Byzantine,  meaning  that  its  effect  is 
perceived  differently  by  different  observers  (as  in  the  intermediate  voltage  example). 

The  great  advantage  to  designs  that  can  tolerate  arbitrary  fault  modes  is  that  we  do 
not  have  to  justify  assumptions  about  more  specific  fault  modes:  a  system  is  shown  to 
tolerate  (say)  two  arbitrary  faults  by  proving  that  it  works  in  the  presence  of  two  faulty 
FCUs  with  no  assumptions  whatsoever  on  the  behavior  of  the  faulty  components.  A 
system  that  can  tolerate  only  specific  fault  modes  may  fail  if  confronted  by  a  different 
fault  mode,  so  it  is  necessary  to  provide  assurance  that  such  modes  cannot  occur.  It  is 
this  absence  of  assumptions  that  is  the  attraction,  in  safety-critical  contexts,  of  systems 
that  can  tolerate  arbitrary  faults.  This  point  is  often  misunderstood  and  such  systems 
are  derided  as  being  focused  on  asymmetric  or  Byzantine  faults,  “which  never  arise  in 
practice.”  Byzantine  faults  are  just  one  manifestation  of  arbitrary  behavior,  and  cannot 
simply  be  asserted  not  to  occur  (in  fact,  they  have  been  observed  in  several  systems 
that  have  been  monitored  sufficiently  closely).  One  situation  that  is  likely  to  provoke 
asymmetric manifestations is a slightly out of specification (SOS) fault, such as the
intermediate  electrical  voltage  mentioned  earlier.  SOS  faults  in  the  timing  dimension 
include  those  that  put  a  signal  edge  very  close  to  a  clock  edge,  or  that  have  signals  with 
very  slow  rise  and  fall  times  (i.e.,  weak  edges).  Depending  on  the  timing  of  their  own 
clock  edges,  some  receivers  may  recognize  and  latch  such  a  signal,  others  may  not, 
resulting  in  asymmetric  or  Byzantine  behavior. 

FCUs  may  be  active  (e.g.,  a  processor)  or  passive  (e.g.,  a  bus);  while  an  arbitrary- 
faulty  active  component  can  do  anything,  a  passive  component  may  change,  lose,  or 
delay  data,  but  it  cannot  spontaneously  create  a  new  datum.  Keyed  checksums  or  digital 
signatures  can  sometimes  be  used  to  reduce  the  fault  modes  of  an  active  FCU  to  those 
of  a  passive  one.  (An  arbitrary-faulty  active  FCU  can  always  create  its  own  messages, 
but  it  cannot  create  messages  purporting  to  come  from  another  FCU  if  it  does  not  know 
the  key  of  that  FCU;  signatures  need  to  be  managed  carefully  for  this  reduction  in  fault 
mode  to  be  credible.) 

Any  fault-tolerant  architecture  will  fail  if  subjected  to  too  many  faults;  generally 
speaking,  it  requires  more  redundancy  to  tolerate  an  arbitrary  fault  than  a  symmetric 
one,  which  in  turn  requires  more  redundancy  than  a  manifest  fault.  The  most  effective 
fault-tolerant  algorithms  make  this  tradeoff  automatically  between  number  and  diffi¬ 
culty  of  faults  tolerated.  For  example,  the  clock  synchronization  algorithm  of  [16]  can 
tolerate  a  arbitrary  faults,  s  symmetric,  and  m  manifest  ones  simultaneously  provided 
n,  the  number  of  FCUs,  satisfies  n  >  3a  +  2s  +  m.  It  is  provably  impossible  (i.e.,  it  can 
be  proven  that  no  algorithm  can  exist)  to  tolerate  a  arbitrary  faults  in  clock  synchro¬ 
nization  with  fewer  than  3a  +  1  FCUs  (unless  digital  signatures  are  employed — which 
is  equivalent  to  reducing  the  severity  of  the  arbitrary  fault  mode). 
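The bound n > 3a + 2s + m can be checked mechanically. The following Python sketch does only that arithmetic; the function names `tolerates` and `min_fcus` are illustrative and are not taken from the cited algorithm.

```python
def tolerates(n, a=0, s=0, m=0):
    """True if n FCUs satisfy n > 3a + 2s + m.

    a, s, m are the numbers of arbitrary, symmetric, and manifest faults.
    """
    if min(n, a, s, m) < 0:
        raise ValueError("counts must be non-negative")
    return n > 3 * a + 2 * s + m


def min_fcus(a=0, s=0, m=0):
    """Smallest n that satisfies the bound for the given fault mix."""
    if min(a, s, m) < 0:
        raise ValueError("counts must be non-negative")
    return 3 * a + 2 * s + m + 1


if __name__ == "__main__":
    print(min_fcus(a=1))            # one arbitrary fault
    print(tolerates(5, a=1, s=1))   # 5 > 3 + 2 does not hold
```

For a single arbitrary fault the bound gives four FCUs, which agrees with the 3a + 1 lower limit quoted above.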

Because  it  is  algorithmically  much  easier  to  tolerate  simple  failure  modes,  some 
architectures  (e.g.,  SAFEbus)  arrange  FCUs  (the  “Bus  Interface  Units”  in  the  case  of 
SAFEbus)  in  self-checking  pairs:  if  the  members  of  a  pair  disagree,  they  go  offline, 
ensuring  that  the  effect  of  their  failure  is  seen  as  a  manifest  fault  (i.e.,  one  that  is  easily 
tolerated).  Most  architectures  also  employ  substantial  self-checking  in  each  FCU;  any 
FCU  that  detects  a  fault  will  shut  down,  thereby  ensuring  that  its  failure  will  be  manifest. 
(This kind of operation is often called fail silence.) Even with extensive self-checking
and  pairwise-checking,  it  may  be  possible  for  some  fault  modes  to  “escape,”  so  it  is 
generally  necessary  to  show  either  that  the  mechanisms  used  have  complete  coverage 
(i.e.,  there  will  be  no  violation  of  fail  silence),  or  to  design  the  architecture  so  that  it  can 
tolerate  the  “escape”  of  at  least  one  arbitrary  fault. 
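The self-checking pair idea can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not a description of SAFEbus: the class name `SelfCheckingPair` and its interface are hypothetical, and the two halves are modelled as plain functions applied to the same input.

```python
class SelfCheckingPair:
    """Two redundant computations whose outputs are compared on every step."""

    def __init__(self, compute_a, compute_b):
        self.compute_a = compute_a
        self.compute_b = compute_b
        self.online = True

    def step(self, inputs):
        """Return the agreed output, or None (silence) if the pair is offline."""
        if not self.online:
            return None
        out_a = self.compute_a(inputs)
        out_b = self.compute_b(inputs)
        if out_a != out_b:
            # Disagreement: go offline so that the failure appears manifest.
            self.online = False
            return None
        return out_a


if __name__ == "__main__":
    healthy = SelfCheckingPair(lambda x: 2 * x, lambda x: 2 * x)
    broken = SelfCheckingPair(lambda x: 2 * x, lambda x: 2 * x + 1)
    print(healthy.step(3), broken.step(3), broken.online)
```

Note that the comparison detects nothing if both halves fail in the same way; that case is one example of a fault that can "escape" the checking.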

Some  architectures  can  tolerate  only  a  single  fault  at  a  time,  but  can  reconfigure  to 
exclude  faulty  FCUs  and  are  then  able  to  tolerate  additional  faults.  In  such  cases,  the 
fault  arrival  rate  is  important:  faults  must  not  arrive  faster  than  the  architecture  can 
reconfigure.  The  architectures  considered  here  operate  according  to  static  schedules, 
which  consist  of  “rounds”  or  “frames”  that  are  executed  repeatedly  in  a  cyclic  fashion. 
The  acceptable  fault  arrival  rate  is  often  then  expressed  in  terms  of  faults  per  round 
(or  the  inverse).  It  is  usually  important  that  every  node  is  scheduled  to  make  at  least 
one  broadcast  in  every  round,  since  this  is  how  fault  status  is  indicated  (and  hence  how 
reconfiguration  is  triggered). 
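The two conditions in this paragraph lend themselves to simple checks. The Python sketch below assumes a round is represented as a list of node names, one per slot, and that fault arrivals are recorded as round numbers; both representations and both function names are assumptions made for illustration.

```python
def nodes_without_slot(schedule, nodes):
    """Nodes that have no broadcast slot in one round of a static schedule.

    schedule: list of node names, one entry per slot of the round.
    nodes:    iterable of all node names in the system.
    """
    return sorted(set(nodes) - set(schedule))


def arrival_rate_ok(fault_rounds, min_gap):
    """True if successive faults arrive at least min_gap rounds apart."""
    ordered = sorted(fault_rounds)
    return all(later - earlier >= min_gap
               for earlier, later in zip(ordered, ordered[1:]))


if __name__ == "__main__":
    print(nodes_without_slot(["A", "B", "C", "A"], ["A", "B", "C", "D"]))
    print(arrival_rate_ok([3, 10, 12], min_gap=2))
```

A schedule for which `nodes_without_slot` returns a non-empty list would leave some node unable to indicate its fault status within a round.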

Historical experience and analysis must be used to show that the hypothesized
modes,  numbers,  and  arrival  rate  are  realistic,  and  that  the  architecture  can  indeed  oper¬ 
ate  correctly  under  those  hypotheses  for  its  intended  mission  time.  But  sometimes  things 
go  wrong:  the  system  may  experience  many  simultaneous  faults  (e.g.,  from  unantici¬ 
pated  high-intensity  radiated  fields  (HIRF)),  or  other  violations  of  its  fault  hypothesis. 
We  cannot  guarantee  correct  operation  in  such  cases  (otherwise  our  fault  hypothesis 
was  too  conservative),  but  safety-critical  systems  generally  are  constructed  to  a  “never 
give up” philosophy and will attempt to continue operation in a degraded mode. The
usual  method  of  operation  in  “never  give  up”  mode  is  that  each  node  reverts  to  local 
control  of  its  own  actuators  using  the  best  information  available  (e.g.,  each  brake  node 
applies  braking  force  proportional  to  pedal  pressure  if  it  is  still  receiving  that  input,  and 
removes  all  braking  force  if  not),  while  at  the  same  time  attempting  to  regain  coordi¬ 
nation  with  its  peers.  Although  it  is  difficult  to  provide  assurance  of  correct  operation 
during  these  upsets,  it  may  be  possible  to  provide  assurance  that  the  system  returns  to 
normal  operation  once  the  faults  cease  (assuming  they  were  transients)  using  the  ideas 
of  self-stabilization  [20]. 
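The local fallback rule in the braking example can be written down directly. In this Python sketch the function name, the `gain` parameter, and the use of `None` to represent a lost pedal input are assumptions made for illustration.

```python
from typing import Optional


def local_brake_force(pedal_pressure: Optional[float], gain: float = 1.0) -> float:
    """Braking force applied by one node under local ("never give up") control.

    pedal_pressure is None when the node no longer receives the pedal input.
    """
    if pedal_pressure is None:
        return 0.0                    # input lost: remove all braking force
    return gain * pedal_pressure      # force proportional to pedal pressure


if __name__ == "__main__":
    print(local_brake_force(0.5, gain=2.0))
    print(local_brake_force(None))
```

The sketch covers only the local control rule; regaining coordination with peer nodes is a separate problem that it does not address.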

Restart  during  operation  may  be  necessary  if  HIRF  or  other  environmental  influ¬ 
ences  lead  to  violation  of  the  fault  hypothesis  and  cause  a  complete  failure  of  the  bus. 
Notice  that  this  failure  must  be  detected  by  the  bus,  and  the  restart  must  be  automatic 
and  very  fast:  most  control  systems  can  tolerate  loss  of  control  inputs  for  only  a  few 
cycles — longer  outages  will  lead  to  loss  of  control.  For  example,  Heiner  and  Thurner 
estimate that the maximum transient outage time for a steer-by-wire automobile application
is 50 ms [6].

Restart  is  usually  initiated  when  an  interface  detects  no  activity  on  any  bus  line  for 
some  interval;  that  interface  will  then  transmit  some  “wake  up”  message  on  all  lines.  Of 
course,  it  is  possible  that  the  interface  in  question  is  faulty  (and  there  was  bus  activity  all 
along  but  that  interface  did  not  detect  it),  or  that  two  interfaces  decide  simultaneously 
to  send  the  “wake  up”  call.  The  first  possibility  must  be  avoided  by  careful  checking, 
preferably  by  independent  units  (e.g.,  both  interfaces  of  a  pair,  or  an  interface  and  its 
guardian);  the  second  requires  some  form  of  collision  detection  and  resolution:  this 
should  be  deterministic  to  guarantee  an  upper  bound  on  the  time  to  reach  resolution  (that 
will allow a single interface to send an uninterrupted “wake up” message) and, ideally,
should  not  depend  on  collision  detection  (because  this  cannot  be  done  reliably).  Notice 
that  it  must  be  possible  to  perform  startup  and  restart  reliably  even  in  the  presence  of 
faulty  components. 

4  Services 

The  essential  basic  purpose  of  these  bus  architectures  is  to  make  it  possible  to  build 
reliable  distributed  applications;  a  desirable  purpose  is  to  make  it  straightforward  to 
build  such  applications.  The  basic  services  provided  by  the  bus  architectures  consid¬ 
ered  here  comprise  clock  synchronization,  time-triggered  activation,  and  reliable  mes¬ 
sage  delivery.  Some  of  the  architectures  provide  additional  services;  their  purpose  is 
to  assist  straightforward  construction  of  reliable  distributed  applications  by  providing 
these  services  in  an  application-independent  manner,  thereby  relieving  the  applications 
of the need to implement these capabilities themselves. Not only does this simplify the
construction  of  application  software,  it  is  sometimes  possible  to  provide  better  services 
when  these  are  implemented  at  the  architecture  level,  and  it  is  also  possible  to  provide 
strong  assurance  that  they  are  implemented  correctly. 

Applications  that  perform  safety-critical  functions  must  generally  be  replicated  for 
fault  tolerance.  There  are  many  ways  to  organize  fault-tolerant  replicated  computations, 
but  a  basic  distinction  is  between  those  that  use  exact  agreement,  and  those  that  use  ap¬ 
proximate  agreement.  Systems  that  use  approximate  agreement  generally  run  several 
copies of the application in different nodes, each using its own sensors, with little coordination
across the different nodes. The motivation for this is a “folk belief” that it
promotes  fault  tolerance:  coordination  is  believed  to  introduce  the  potential  for  com¬ 
mon  mode  failures.  Because  different  sensors  cannot  be  expected  to  deliver  exactly  the 
same  readings,  the  outputs  (i.e.,  actuator  commands)  computed  in  the  different  nodes 
will  also  differ.  Thus,  the  only  way  to  detect  faulty  outputs  is  by  looking  for  values  that 
differ  by  “a  lot”  from  the  others.  Hence,  these  systems  use  some  form  of  selection  or 
threshold  voting  to  select  a  good  value  to  send  to  the  actuators,  and  similar  techniques 
to  identify  faulty  nodes  that  should  be  excluded  from  the  configuration.  A  difficulty 
for  applications  of  the  kind  considered  here  is  that  hosts  accumulate  state  that  diverges 
from  that  of  others  over  time  (e.g.,  velocity  and  position  as  a  result  of  integrating  ac¬ 
celeration),  and  they  execute  mode  switches  that  are  discrete  decisions  based  on  local 
sensor  values  (e.g.,  change  the  gain  schedule  in  the  control  laws  if  the  altitude,  or  tem¬ 
perature,  is  above  a  specific  value).  Thus  small  differences  in  sensor  readings  can  lead 
to  major  differences  in  outputs  and  this  can  mislead  the  approximate  selection  or  vot¬ 
ing  mechanisms  into  choosing  a  faulty  value,  or  excluding  a  nonfaulty  node.  The  fix  to 
these  problems  is  to  attempt  to  coordinate  discrete  mode  switches  and  periodically  to 
bring  state  data  into  convergence.  But  these  fixes  are  highly  application  specific,  and 
they  are  contrary  to  the  original  philosophy  that  motivated  the  choice  of  approximate 
agreement — hence,  there  is  a  good  chance  of  doing  them  wrong.  There  are  numerous 
examples that justify this concern; several that were discovered in flight tests are documented
by Mackall and colleagues [10]. The essential point of Mackall’s data is that all
the  failures  observed  in  flight  test  were  due  to  bugs  in  the  design  of  the  fault  tolerance 
mechanisms  themselves,  and  all  these  bugs  could  be  traced  to  difficulties  in  organizing 
and  coordinating  systems  based  on  approximate  agreement. 
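The difficulty with discrete mode switches can be made concrete with a small Python sketch. Everything in it is hypothetical: the control law, the switch altitude, the gains, and the threshold are invented for illustration, and mid-value selection stands in for whatever selection or voting rule a real system would use.

```python
from statistics import median


def command(altitude, error, switch_altitude=10000.0):
    """Actuator command with a discrete mode switch on altitude (invented law)."""
    gain = 2.0 if altitude > switch_altitude else 1.0
    return gain * error


def select_and_flag(outputs, threshold):
    """Mid-value selection plus threshold-based exclusion.

    outputs:   dict mapping node name -> computed actuator command.
    threshold: largest deviation from the selected value still deemed healthy.
    Returns (selected value, sorted list of nodes flagged as suspect).
    """
    selected = median(outputs.values())
    suspects = sorted(name for name, value in outputs.items()
                      if abs(value - selected) > threshold)
    return selected, suspects


if __name__ == "__main__":
    # Three nonfaulty nodes whose altitude sensors differ by a few feet.
    outputs = {
        "A": command(9999.0, 5.0),
        "B": command(10001.0, 5.0),
        "C": command(10002.0, 5.0),
    }
    print(select_and_flag(outputs, threshold=1.0))
```

In this example node A is flagged as a suspect although it is not faulty: its sensor reading fell just below the switch altitude, so it used a different gain from the other two nodes.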

Systems  based  on  exact  agreement  face  up  to  the  fact  that  coordination  among  repli¬ 
cated  computations  is  necessary,  and  they  take  the  necessary  steps  to  do  it  right.  If  we 
are  to  use  exact  agreement,  then  every  replica  must  perform  the  same  computation  on 
the  same  data:  any  disagreement  on  the  outputs  then  indicates  a  fault;  comparison  can 
be  used  to  detect  those  faults,  and  majority  voting  to  mask  them.  A  vital  element  in 
this  approach  to  fault  tolerance  is  that  replicated  components  must  work  on  the  same 
data:  thus,  if  one  node  reads  a  sensor,  it  must  distribute  that  reading  to  all  the  redundant 
copies  of  the  application  running  in  other  nodes.  Now  a  fault  in  that  distribution  mecha¬ 
nism  could  result  in  one  node  getting  one  value  and  another  a  different  one  (or  no  value 
at  all).  This  would  abrogate  the  requirement  that  all  replicas  obtain  identical  inputs,  so 
we  need  to  employ  mechanisms  to  overcome  this  behavior. 

The problem of distributing data consistently in the presence of faults is variously
called interactive consistency, consensus, atomic broadcast, or Byzantine agreement
[9, 12]. When a node transmits a message to several receivers, interactive consistency
requires  the  following  two  properties  to  hold. 

Agreement:  All  nonfaulty  receivers  obtain  the  same  message  (even  if  the  transmitting 
node  is  faulty). 

Validity:  If  the  transmitter  is  nonfaulty,  then  nonfaulty  receivers  obtain  the  message 
actually  sent. 

Algorithms  for  achieving  these  requirements  in  the  presence  of  arbitrary  faults  necessar¬ 
ily  involve  more  than  a  single  data  exchange  (basically,  each  receiver  must  compare  the 
value  it  received  against  those  received  by  others).  It  is  provably  impossible  to  achieve 
interactive  consistency  in  the  presence  of  a  arbitrary  faults  unless  there  are  at  least 
3a  +  1  FCUs,  2a  +  1  disjoint  communication  paths  between  them,  and  a  +  1  levels  (or 
“rounds”)  of  communication.  The  number  of  FCUs  and  the  number  of  disjoint  paths 
required,  but  not  the  number  of  rounds,  can  be  reduced  by  using  digital  signatures. 
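For a = 1, the exchange described above (each receiver compares the value it received against those received by the others) can be sketched as a two-round simulation in the style of the classic oral-messages algorithm OM(1). The Python code below is a minimal sketch under stated assumptions: at most one faulty node, whose behavior is supplied by a hypothetical `lie` callback, and a default value of 0 when a receiver sees no strict majority.

```python
from collections import Counter


def majority(values, default=0):
    """Strict-majority value among values, or default if there is none."""
    value, count = Counter(values).most_common(1)[0]
    return value if count > len(values) / 2 else default


def om1(transmitter, receivers, value, faulty=None, lie=None):
    """Simulate one transmission with a single round of relaying.

    faulty: name of the single faulty node, or None.
    lie:    function (sender, dest, true_value) -> value the faulty node sends.
    Returns a dict mapping each nonfaulty receiver to the value it decides on.
    """
    def send(sender, dest, v):
        return lie(sender, dest, v) if sender == faulty else v

    # Round 1: the transmitter sends its value to every receiver.
    direct = {r: send(transmitter, r, value) for r in receivers}

    # Round 2: each receiver relays what it received to every other receiver,
    # and each nonfaulty receiver takes the majority of all values it holds.
    decided = {}
    for r in receivers:
        if r == faulty:
            continue
        relayed = [send(q, r, direct[q]) for q in receivers if q != r]
        decided[r] = majority([direct[r]] + relayed)
    return decided


if __name__ == "__main__":
    # Faulty transmitter sends 0 to A but 1 to B and C.
    split = {"A": 0, "B": 1, "C": 1}
    print(om1("T", ["A", "B", "C"], 1, faulty="T", lie=lambda s, d, v: split[d]))
```

With four FCUs the nonfaulty receivers reach the same decision whether the transmitter or one receiver is faulty. With only three FCUs and this decision rule, a faulty receiver that relays a wrong value can make the other receiver decide on a value the nonfaulty transmitter did not send, which violates validity.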

The  problem  might  seem  moot  in  architectures  that  employ  a  physical  bus,  since  a 
bus  surely  cannot  deliver  values  inconsistently  (so  the  agreement  property  is  achieved 
trivially).  Unfortunately,  it  can — though  it  is  likely  to  be  a  very  rare  event.  The  scenarios 
involving  SOS  faults  presented  earlier  exemplify  some  possibilities. 

Dealing  properly  with  very  rare  events  is  one  of  the  attributes  that  distinguishes  a 
design  that  is  fit  for  safety-critical  systems  from  one  that  is  not.  It  follows  that  either 
the  application  software  must  perform  interactive  consistency  for  itself  (incurring  the 
cost of n^2 messages to establish consistency across n nodes in the presence of a single
arbitrary  fault),  or  the  bus  architecture  must  do  it. 

The  first  choice  is  so  unattractive  that  it  vitiates  the  whole  purpose  of  a  fault-tolerant 
bus  architecture.  Most  bus  architectures  therefore  provide  some  type  of  interactively 
consistent  message  broadcast  as  a  basic  service.  In  addition,  most  architectures  take 
steps  to  reduce  the  incidence  of  asymmetric  transmissions  (i.e.,  those  that  appear  as  one 
value  to  some  receivers,  and  as  different  values,  or  the  absence  of  values,  to  others).  As 
noted,  SOS  faults  are  among  the  most  plausible  sources  of  asymmetric  transmissions. 
SOS  faults  that  cause  asymmetric  transmissions  can  arise  in  either  the  value  or  time  do¬ 
mains  (e.g.,  intermediate  voltages,  or  weak  edges,  respectively).  In  those  architectures 
that  employ  a  bus  guardian  in  a  central  hub  or  “in  series”  with  each  interface,  the  bus 
guardians  are  a  possible  point  of  intervention  for  the  control  of  SOS  faults:  a  suitable 
guardian  can  reshape,  in  both  value  and  time  domains,  the  signal  sent  to  it  by  the  con¬ 
troller.  Of  course,  the  guardian  could  be  faulty  and  may  make  matters  worse — so  this 
approach  makes  sense  only  when  there  are  independent  guardians  on  each  of  two  (or 
more)  replicated  interconnects.  Observe  that  for  credible  signal  reshaping,  the  guardian 
must  have  a  power  supply  that  is  independent  of  that  of  the  controller  (faults  in  power 
supply  are  the  most  likely  cause  of  intermediate  voltages  and  weak  edges). 

Interactively  consistent  message  broadcast  provides  the  foundation  for  fault  toler¬ 
ance  based  on  exact  agreement.  There  are  several  ways  to  use  this  foundation.  One 
arrangement,  confusingly  called  the  state  machine  approach  [19],  is  based  on  majority 
voting:  application  replicas  run  on  a  number  of  different  nodes,  exchange  their  output 
values,  and  deliver  a  majority  vote  to  the  actuators. 
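The voting step can be sketched as follows. This is an illustrative Python fragment, not an implementation taken from any of the architectures discussed; the function name `vote` and the decision to raise an exception when no strict majority exists are assumptions.

```python
from collections import Counter


def vote(outputs):
    """Majority vote over replica outputs that should be identical.

    outputs: dict mapping replica name -> output value (values must be hashable).
    Returns (majority value, sorted list of replicas that disagreed with it).
    """
    value, count = Counter(outputs.values()).most_common(1)[0]
    if count <= len(outputs) / 2:
        raise RuntimeError("no majority: fault hypothesis exceeded")
    dissenters = sorted(r for r, v in outputs.items() if v != value)
    return value, dissenters


if __name__ == "__main__":
    print(vote({"n1": 42, "n2": 42, "n3": 17}))
```

Because all replicas work on identical inputs, any output that differs from the majority indicates a fault, so the list of dissenters doubles as a diagnosis.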

Another arrangement is based on self-checking (either by individuals or pairs) so
that  faults  result  in  fail-silence.  This  will  be  detected  by  other  nodes,  and  some  backup 
application  running  in  those  other  nodes  can  take  over.  The  architecture  can  assist  this 
master/shadow  arrangement  by  providing  services  that  support  the  rollover  from  one 
node  to  another.  One  such  service  automatically  substitutes  a  backup  node  for  a  failed 
master  (both  the  master  and  the  backup  occupy  the  same  slot  in  the  schedule,  but  the 
backup is inhibited from transmitting unless the master has failed). A variant has both
master  and  backup  operating  in  different  slots,  but  the  backup  inhibits  itself  unless 
it  is  informed  that  the  master  has  failed.  A  further  variation,  called  compensation, 
applies  when  different  nodes  have  access  to  different  actuators:  none  is  a  direct  backup 
to  any  other,  but  each  changes  its  operation  when  informed  that  others  have  failed  (an 
example  is  car  braking:  separate  nodes  controlling  the  braking  force  at  each  wheel  will 
redistribute  the  force  when  informed  that  one  of  their  number  has  failed). 

The  variations  on  master/shadow  described  above  all  depend  on  a  “failure  notifica¬ 
tion,”  or  equivalently  a  “membership”  service.  The  crucial  requirement  on  such  a  ser¬ 
vice  is  that  it  must  produce  consistent  knowledge:  that  is,  if  one  nonfaulty  node  thinks 
that  a  particular  node  has  failed,  then  all  other  nonfaulty  nodes  must  hold  the  same 
opinion — otherwise,  the  system  will  lose  coordination,  with  potentially  catastrophic 
results  (e.g.,  if  the  nodes  controlling  braking  at  different  wheels  make  different  adjust¬ 
ments  to  their  braking  force  based  on  different  assessments  of  which  others  have  failed). 
Notice  that  this  must  also  apply  to  a  node’s  knowledge  of  its  own  status:  a  naive  view 
might  assume  that  a  node  that  is  receiving  messages  and  seeing  no  problems  in  its  own 
operation  should  assume  it  is  in  the  membership.  But  if  this  node  is  unable  to  transmit, 
all  other  nodes  will  have  removed  it  from  their  memberships  and  will  be  making  suit¬ 
able  compensation  on  the  assumption  that  this  node  has  entered  its  “blackout”  mode 
(and  is,  for  example,  applying  no  force  to  its  brake).  It  could  be  catastrophic  if  this  node 
does  not  adopt  the  consensus  view  and  continues  operation  (e.g.,  applying  force  to  its 
brake)  based  on  its  local  assessment  of  its  own  health. 
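The braking example can be put into numbers with a short Python sketch. The equal split of the demanded force among the nodes in the membership is a deliberate oversimplification, and the wheel names and force values are hypothetical.

```python
def wheel_force(me, membership, demand):
    """Force applied by node `me`, given its own view of the membership."""
    if me not in membership:
        return 0.0                      # "blackout": apply no force
    return demand / len(membership)     # equal split (illustrative assumption)


def total_force(views, demand):
    """Sum of forces when each node acts on its own view of the membership.

    views: dict mapping node name -> set of nodes that node believes are members.
    """
    return sum(wheel_force(node, view, demand) for node, view in views.items())


if __name__ == "__main__":
    survivors = {"FL", "FR", "RL"}
    # RR cannot transmit, so the other three nodes have removed it.
    consistent = {"FL": survivors, "FR": survivors, "RL": survivors, "RR": survivors}
    # RR trusts its local assessment and still counts itself as a member.
    naive = dict(consistent, RR={"FL", "FR", "RL", "RR"})
    print(total_force(consistent, 120.0), total_force(naive, 120.0))
```

When RR adopts the consensus view, the total applied force equals the demand. When it acts on its local assessment, it adds force that the other nodes have already compensated for, and the total exceeds the demand.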

A  membership  service  operates  as  follows.  Each  node  maintains  a  private  member¬ 
ship  list,  which  is  intended  to  comprise  all  and  only  the  nonfaulty  nodes.  Since  it  can 
take  a  while  to  diagnose  a  faulty  node,  we  have  to  allow  the  common  membership  to 
contain  at  most  one  faulty  node.  Thus,  a  membership  service  must  satisfy  the  following 
two  requirements. 

Agreement:  The  membership  lists  of  all  nonfaulty  nodes  are  the  same. 

Validity:  The  membership  lists  of  all  nonfaulty  nodes  contain  all  nonfaulty  nodes  and 
at  most  one  faulty  node. 
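The two requirements can be stated as executable checks. The Python sketch below evaluates them from the viewpoint of an omniscient observer who knows which nodes are actually faulty; the function name and the data representation are assumptions made for illustration.

```python
def check_membership(lists, faulty):
    """Check the agreement and validity requirements.

    lists:  dict mapping node name -> set of nodes in that node's private list.
    faulty: set of nodes that are actually faulty (known only to the observer).
    Returns (agreement, validity) as a pair of booleans.
    """
    faulty = set(faulty)
    nonfaulty = set(lists) - faulty
    views = [frozenset(lists[node]) for node in nonfaulty]
    # Agreement: all nonfaulty nodes hold the same list.
    agreement = len(set(views)) <= 1
    # Validity: each such list has every nonfaulty node, at most one faulty one.
    validity = all(nonfaulty <= view and len(view & faulty) <= 1
                   for view in views)
    return agreement, validity


if __name__ == "__main__":
    everyone = {"A", "B", "C", "D"}
    undiagnosed = {"A": everyone, "B": everyone, "C": everyone}
    print(check_membership(undiagnosed, faulty={"D"}))
```

The example shows the case where faulty node D has not yet been diagnosed: the lists agree and contain exactly one faulty node, so both requirements hold.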

These  requirements  can  be  achieved  only  under  benign  fault  hypotheses  (it  is  prov- 
ably  impossible  to  diagnose  an  arbitrary-faulty  node  with  certainty).  When  unable  to 
maintain  accurate  membership,  the  best  recourse  is  to  maintain  agreement,  but  sacrifice 
validity  (nonfaulty  nodes  that  are  not  in  the  membership  can  then  attempt  to  rejoin). 
This weakened requirement is called “clique avoidance” [2].

Note that it is quite simple to achieve consistent membership on top of an interactively
consistent message service: each node broadcasts its own membership list to
every other node, and each node runs a deterministic resolution algorithm on the (identical,
by interactive consistency) lists received. Conversely, a membership and clique-avoidance
service can assist the construction of an interactively consistent message service:
simply exclude from the membership any node that receives a message different
than  the  majority  (TTA  does  this). 
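The first construction can be sketched as follows. The text requires only that the resolution algorithm be deterministic; the particular rule used in this Python sketch (keep the nodes that a strict majority of the received lists include) is an assumption chosen for illustration, as is the function name.

```python
def resolve(received):
    """Deterministic resolution of membership lists.

    received: dict mapping sender name -> set of nodes in that sender's list.
    Every node is assumed to hold the same `received` (by interactive
    consistency), so every node computes the same result.
    Returns the sorted list of nodes included by a strict majority of lists.
    """
    candidates = set().union(*received.values())
    quorum = len(received) / 2
    return sorted(node for node in candidates
                  if sum(node in lst for lst in received.values()) > quorum)


if __name__ == "__main__":
    received = {
        "A": {"A", "B", "C"},
        "B": {"A", "B", "C"},
        "C": {"A", "B", "C", "D"},
        "D": {"A", "B", "C", "D"},
    }
    print(resolve(received))
```

Because the function depends only on its input and iterates in a fixed order, identical inputs at every node yield identical membership lists.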

5  Practical  Implementations 

Here,  we  provide  sketches  of  four  bus  architectures  that  provide  concrete  solutions  to 
the  requirements  and  design  challenges  outlined  in  the  previous  sections.  More  details 
are  available  in  a  companion  report  to  this  paper  [17].  All  four  buses  support  the  time- 
triggered  model  of  computation,  employ  fault-tolerant  distributed  clock  synchroniza¬ 
tion,  and  use  bus  guardians  or  some  equivalent  mechanism  to  protect  against  babbling 
idiot  failure  modes.  They  differ  in  their  fault  hypotheses,  mechanisms  employed,  ser¬ 
vices  provided,  and  in  their  assurance,  performance,  and  cost. 

SAFEbus.  Honeywell  developed  SAFEbus™  (the  principal  designers  are  Kevin 
Driscoll  and  Ken  Hoyme  [7])  to  serve  as  the  core  of  the  Boeing  777  Airplane  Infor¬ 
mation  Management  System  (AIMS)  [22],  which  supports  several  critical  functions, 
such  as  cockpit  displays  and  airplane  data  gateways.  The  bus  has  been  standardized  as 
ARINC  659  [1]  and  variations  on  Honeywell’s  implementation  are  being  used  or  con¬ 
sidered  for  other  avionics  and  space  applications.  It  uses  a  bus  interconnect  similar  to 
that  shown  in  Figure  1;  the  interfaces  (they  are  called  Bus  Interface  Units,  or  BIUs) 
are  duplicated,  and  the  interconnect  bus  is  quad-redundant.  Most  of  the  functionality  of 
SAFEbus  is  implemented  in  the  BIUs,  which  perform  clock  synchronization  and  mes¬ 
sage  scheduling  and  transmission  functions.  Each  BIU  of  a  pair  is  a  separate  FCU  and 
acts  as  its  partner’s  bus  guardian  by  controlling  its  access  to  the  interconnect. 

Each  BIU  of  a  pair  drives  a  different  pair  of  interconnect  buses  but  is  able  to  read 
all  four;  the  interconnect  buses  themselves  each  comprise  two  data  lines  and  one  clock 
line and operate at 30 MHz. The bus lines and their drivers have the electrical characteristics
of OR-gates (i.e., if several different BIUs drive the same line at the same
time,  the  resulting  signal  is  the  OR  of  the  separate  inputs).  Some  of  the  protocols  ex¬ 
ploit  this  property;  in  particular,  clock  synchronization  is  achieved  using  an  event-based 
algorithm. 

The  paired  BIUs  at  sender  and  receiver,  and  the  quad-redundant  buses,  provide 
sufficient  redundancy  for  SAFEbus  to  provide  interactively  consistent  message  broad¬ 
casts  (in  the  Honeywell  implementation)  using  an  approach  similar  to  that  described 
by  Davies  and  Wakerly  [4]  (this  remarkably  prescient  paper  anticipated  many  of  the 
issues  and  solutions  in  Byzantine  fault  tolerance  by  several  years).  It  also  supports 
application-level  fault  tolerance  (based  on  self-checking  pairs)  by  providing  automatic 
rapid  rollover  from  masters  to  shadows. 

Its  fault  hypothesis  includes  arbitrary  faults,  faults  in  several  nodes  (but  only  one 
per  node),  and  a  high  rate  of  fault  arrivals.  It  never  gives  up  and  has  a  well-defined 
restart  and  recovery  strategy  from  fault  arrivals  that  exceed  its  fault  hypothesis.  It  toler¬ 
ates  spatial  proximity  faults  in  the  AIMS  application  by  duplicating  the  entire  system. 
SAFEbus  is  certified  for  use  in  passenger  aircraft  and  has  extensive  field  experience  in 
the  Boeing  777.  The  Honeywell  implementation  is  supported  by  an  in-house  tool  chain. 

SAFEbus is the most mature of the four buses considered, and makes the fewest
compromises.  But  because  each  of  its  major  components  is  paired  (and  its  bus  requires 
separate  lines  for  clock  and  data),  it  is  the  most  expensive  of  those  available  for  com¬ 
mercial  use  (typically,  a  few  hundred  dollars  per  node). 

TTA.  The  Time  Triggered  Architecture  (TTA)  was  developed  by  Hermann  Kopetz 
and colleagues at the Technical University of Vienna [8]. Commercial development of
the  architecture  is  undertaken  by  TTTech  and  it  is  being  deployed  for  safety-critical 
applications  in  cars  by  Audi  and  Volkswagen,  and  for  flight-critical  functions  in  aircraft 
and  aircraft  engines  by  Honeywell. 

Current implementations of TTA use a bus interconnect similar to that shown in Figure 1.
The interfaces (they are called controllers) implement the TTP/C protocol [24]
that  is  at  the  heart  of  TTA,  providing  clock  synchronization,  and  message  sequenc¬ 
ing  and  transmission  functions.  The  interconnect  bus  is  duplicated  and  each  controller 
drives  both  of  them  through  partially  independent  bus  guardians.  TTA  uses  an  averaging 
clock  synchronization  algorithm  based  on  that  of  Lundelius  and  Lynch  [25].  This  algo¬ 
rithm  is  implemented  in  the  controllers,  but  requires  too  many  resources  to  be  replicated 
in  the  bus  guardians.  The  guardians,  which  have  independent  clocks,  therefore  rely  on 
their  controllers  for  a  “start  of  frame”  signal.  This  compromises  their  independence 
somewhat  (they  also  share  the  power  supply  and  some  other  resources  with  their  con¬ 
trollers),  so  forthcoming  implementations  of  TTA  use  a  star  interconnect  similar  to  that 
shown  in  Figure  2.  Here,  the  guardian  functionality  is  implemented  in  the  central  hub 
which  is  fully  independent  of  the  controllers:  the  hubs  and  controllers  comprise  separate 
FCUs  in  this  implementation.  Hubs  are  duplicated  for  fault  tolerance  and  located  apart 
to  withstand  spatial  proximity  faults.  They  also  perform  signal  reshaping  to  reduce  the 
incidence  of  SOS  faults. 

TTA  employs  algorithms  for  group  membership  and  clique  avoidance  [2];  these 
enable  its  clock  synchronization  algorithm  to  tolerate  multiple  faults  (by  reconfigur¬ 
ing  to  exclude  faulty  members)  and  combine  with  its  use  of  checksums  (which  can  be 
considered  as  digital  signatures)  to  provide  a  form  of  interactively  consistent  message 
broadcasts.  The  membership  service  supports  application-level  fault  tolerance  based  on 
master-backup  or  compensation.  Proposed  extensions  provide  state  machine  replication 
in  a  manner  that  is  transparent  to  applications. 

The  fault  hypothesis  of  TTA  includes  arbitrary  faults,  and  faults  in  several  nodes 
(but  only  one  per  node),  provided  these  arrive  at  least  two  rounds  apart  (this  allows  the 
membership algorithm to exclude the faulty node). TTA never gives up and has a
well-defined restart and recovery strategy for fault arrivals that exceed this hypothesis.

The  prototype  implementations  of  TTA  have  been  subjected  to  extensive  testing  and 
fault  injections,  and  deployed  in  experimental  vehicles.  Several  of  its  algorithms  have 
been  formally  verified  [13, 14],  and  aircraft  applications  under  development  are  planned 
to  lead  to  FAA  certification.  It  is  supported  by  an  extensive  tool  suite  that  interfaces  to 
standard CAD environments (e.g., Matlab/Simulink and Beacon). Current implementations
provide 25 Mbit/s data rates; research projects are designing implementations for
gigabit  rates.  TTA  controllers  and  the  star  hub  (which  is  basically  a  modified  controller) 
are  quite  simple  and  cheap  to  produce  in  volume. 




Of the architectures considered here, TTA is unique in being used for both automobile
applications, where volume manufacture leads to very low prices, and aircraft,
where a mature tradition of design and certification for flight-critical electronics
provides strong scrutiny of arguments for safety.

SPIDER. A Scalable Processor-Independent Design for Electromagnetic Resilience
(SPIDER) is being developed by Paul Miner and colleagues at the NASA Langley Research
Center as a research platform to explore recovery strategies for radiation-induced
(HIRF/EMI) faults, and to serve as a case study to exercise the recent design assurance
guidelines for airborne electronic hardware (DO-254) [15].

SPIDER  uses  a  star  configuration  similar  to  that  shown  in  Figure  2,  in  which  the 
interfaces  (called  BIUs)  may  be  located  either  with  their  hosts  or  in  the  centralized  hub, 
which  also  contains  active  elements  called  Redundancy  Management  Units,  or  RMUs. 

Clock synchronization and other services of SPIDER are achieved by novel distributed
algorithms executed among the BIUs and RMUs [11]. The services provided
include interactively consistent message broadcasts, and identification of failed nodes
(from which a membership service can easily be synthesized). SPIDER’s fault hypothesis
uses a hybrid fault model, which includes arbitrary faults, and allows some combinations
of multiple faults. Its algorithms are novel and highly efficient and are being
formally verified.

SPIDER  is  an  interesting  design  that  uses  a  different  topology  and  a  different  class 
of  algorithms  from  the  other  buses  considered  here.  However,  it  is  a  research  project 
whose  design  and  implementation  are  still  in  progress  and  so  it  cannot  be  compared 
directly  with  the  commercial  products. 

FlexRay. A consortium including BMW, DaimlerChrysler, Motorola, and Philips is
developing FlexRay for powertrain and chassis control in cars. It differs from the other
buses considered here in that its operation is divided between time-triggered and
event-triggered activities. Published descriptions of the FlexRay protocols and implementation
are sketchy at present [3] (see also the Web site www.flexray-group.com).

FlexRay can use either an “active” star interconnect similar to that shown in Figure
2, or a “passive” bus similar to that shown in Figure 1. In both cases, duplication of the
interconnect  is  optional.  The  star  configuration  of  FlexRay  (and  also  that  of  TTA)  can 
also  be  deployed  in  distributed  configurations  where  subsystems  are  connected  by  hub- 
to-hub  links.  Each  FlexRay  interface  (it  is  called  a  communication  controller)  drives 
the  lines  to  its  interconnects  through  separate  bus  guardians  located  with  the  interface. 
(This  means  that  with  two  buses,  each  node  has  three  clocks:  one  for  the  controller  and 
one  for  each  of  the  two  guardians;  this  differs  from  the  bus  configuration  of  TTA  where 
there  is  one  clock  for  the  controller  and  both  guardians  share  a  second  clock.)  Like  the 
bus  configuration  of  TTA,  the  guardians  of  FlexRay  are  not  fully  independent  of  their 
controllers. 

FlexRay  aims  to  be  more  flexible  than  the  other  buses  considered  here,  and  this 
seems  to  be  reflected  in  the  choice  of  its  name.  As  noted,  one  manifestation  of  this 
flexibility  is  its  combination  of  time-  and  event-triggered  operation.  FlexRay  partitions 
each  time  cycle  into  a  “static”  time-triggered  portion,  and  a  “dynamic”  event-triggered 
portion. The division between the two portions is set at design time and loaded into
the controllers and bus guardians. Communication during the event-driven portion of
the  cycle  uses  the  Byteflight  protocol.  Unlike  SAFEbus  and  TTA,  FlexRay  does  not 
install  the  full  schedule  for  the  time-triggered  portion  in  each  controller.  Instead,  this 
portion  of  the  cycle  is  divided  into  a  number  of  slots  of  fixed  size  and  each  controller 
and  its  bus  guardians  are  informed  only  of  those  slots  allocated  to  their  transmissions 
(nodes  requiring  greater  bandwidth  are  assigned  more  slots  than  those  that  require  less). 
Controllers  learn  the  full  schedule  only  when  the  bus  starts  up.  Each  node  includes 
its  identity  in  the  messages  that  it  sends;  during  startup,  nodes  use  these  identifiers  to 
label  their  input  buffers  as  the  schedule  reveals  itself  (e.g.,  if  the  messages  that  arrive 
in  slots  1  and  7  carry  identifier  3,  then  all  nodes  will  thereafter  deliver  the  contents 
of  buffers  1  and  7  to  the  task  that  deals  with  input  from  node  3).  There  appears  to  be  a 
vulnerability  here:  a  faulty  node  could  masquerade  as  another  (i.e.,  send  a  message  with 
the  wrong  identifier)  during  startup  and  thereby  violate  partitioning  for  the  remainder 
of  the  mission.  It  is  not  clear  how  this  fault  mode  is  countered. 
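
The startup behavior described above, and the masquerading concern it raises, can be illustrated with a small sketch. It is not taken from any FlexRay specification: the function name `learn_schedule` and the representation of frames as (slot, identifier) pairs are assumptions made purely for illustration.

```python
from collections import defaultdict


def learn_schedule(startup_frames):
    """Map each sender identifier to the static slots in which it was heard.

    startup_frames: iterable of (slot_number, claimed_sender_id) pairs seen
    during startup. In this sketch the receiver has no independent way to
    check the claimed identifier.
    """
    slots_by_sender = defaultdict(list)
    for slot, claimed_id in startup_frames:
        slots_by_sender[claimed_id].append(slot)
    return dict(slots_by_sender)


# Fault-free startup: node 3 owns slots 1 and 7, node 5 owns slot 4.
honest = [(1, 3), (4, 5), (7, 3)]
# Faulty startup: node 5 sends identifier 3 in its own slot 4.
masquerade = [(1, 3), (4, 3), (7, 3)]

print(learn_schedule(honest))
print(learn_schedule(masquerade))
```

In the second run every receiver would deliver slot 4 to the task that handles input from node 3, which is the partitioning violation the text warns about.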

Like TTA, FlexRay uses the Lundelius-Lynch clock synchronization algorithm but,
unlike TTA, it does not use a membership algorithm to exclude faulty nodes. FlexRay
provides no services to its applications beyond best-efforts message delivery; in particular,
it does not provide interactively consistent message broadcasts. This means that
all mechanisms for fault-tolerant applications must be provided by the applications programs
themselves. Published descriptions of FlexRay do not specify its fault hypothesis,
and  it  appears  to  have  no  mechanisms  to  counter  certain  fault  modes  (e.g.,  SOS  faults  or 
other  sources  of  asymmetric  broadcasts,  and  masquerading  on  startup).  A  never-give-up 
strategy  has  not  been  described,  nor  have  systematic  or  formal  approaches  to  assurance 
and  certification. 

FlexRay  is  interesting  because  of  its  mixture  of  time-  and  event-triggered  operation, 
and  potentially  important  because  of  the  industrial  clout  of  its  developers.  Currently,  it 
is  the  slowest  of  the  commercial  buses,  with  a  claimed  data  rate  of  no  more  than  10 
Mbit/s. 

6  Summary  and  Conclusion 

A  safety-critical  bus  architecture  provides  certain  properties  and  services  that  assist  in 
construction  of  safety-critical  systems.  As  with  any  system  framework  or  middleware 
package,  these  buses  offer  a  tradeoff  to  system  developers;  they  provide  a  coherent 
collection  of  services,  with  strong  properties  and  highly  assured  implementations,  but 
developers  must  sacrifice  some  design  freedom  to  gain  the  full  benefit  of  these  services. 
For  example,  all  these  buses  use  a  time-triggered  model  of  computation,  and  system 
developers  must  build  their  applications  within  that  framework.  In  return,  the  buses  are 
able  to  guarantee  strong  partitioning:  faults  in  individual  components  or  applications 
(“functions”  in  avionics  terms)  cannot  propagate  to  others,  nor  can  they  bring  down  the 
entire  bus  (within  the  constraints  of  its  fault  hypothesis). 

Partitioning is the minimum requirement, however. It ensures that one failed function
will not drag down others, but in many safety-critical systems the failure of even
a single function can be catastrophic, so the individual functions must themselves be
made fault tolerant. Accordingly, most of the buses provide mechanisms to assist the
development of fault-tolerant applications. The key requirement here is interactively
consistent message transfer: this ensures that all masters and shadows (or masters and
monitors), or all members of a voting pool, maintain consistent state. Three of the buses
considered here provide this basic service; some of them do so in association with other
services, such as master/shadow rollover or group membership, that can be provided with
much increased efficiency and much reduced latency when implemented at a low level.
FlexRay, alone, provides none of these services. In their absence, all mechanisms for
fault-tolerance must be implemented in applications programs. Thus, application programmers,
who may have little experience in the subtleties of fault-tolerant systems, become
responsible for the design, implementation, and assurance of very delicate mechanisms
with no support from the underlying bus architecture. Not only does this increase
the cost and difficulty of making sure that things are done right, it also increases
their computational cost and latency. For example, in the absence of an interactively
consistent message service provided by the architecture, applications programs must
explicitly transmit the multiple rounds of cross-comparisons that are needed to implement
this service at a higher level, thereby substantially increasing the message load.
Such a cost will invite inexperienced developers to seek less expensive ways to achieve
fault tolerance, in probable ignorance of the impossibility results in the theoretical
literature, and the history of intractable “Heisenbugs” (rare, unrepeatable failures)
encountered by practitioners who pushed for $10^{-9}$ with inadequate foundations.
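
To give a feel for the message load involved, the following sketch counts the messages in a two-round exchange in which receivers cross-compare a value before deciding, in the style of the classic single-fault oral-messages scheme. The function `exchange`, the node numbering, and the way a faulty relay corrupts values are illustrative assumptions; none of this describes the mechanism of any bus discussed here.

```python
from collections import Counter


def exchange(value, receivers, faulty_relay=None):
    """Two-round exchange: the sender broadcasts, then receivers cross-compare.

    Returns (decisions, message_count). A faulty relay forwards a corrupted
    value ("?") in round 2; the sender itself is assumed correct in this sketch.
    """
    # Round 1: the sender transmits its value to every receiver.
    received = {r: value for r in receivers}
    messages = len(receivers)

    # Round 2: every receiver relays what it got to every other receiver,
    # and each receiver takes the majority of the values it now holds.
    decisions = {}
    for r in receivers:
        votes = [received[r]]
        for other in receivers:
            if other != r:
                votes.append("?" if other == faulty_relay else received[other])
        decisions[r] = Counter(votes).most_common(1)[0][0]
    messages += len(receivers) * (len(receivers) - 1)
    return decisions, messages


decisions, count = exchange("brake", receivers=[1, 2, 3], faulty_relay=2)
print(decisions, count)
```

With three receivers, agreeing on a single value takes nine point-to-point messages in this sketch, against three for the unchecked first round alone, and the total grows quadratically with the number of receivers.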

It  is  unlikely  that  any  single  bus  architecture  will  satisfy  all  needs  and  markets,  so 
it  is  possible  that  FlexRay’s  lack  of  application-level  fault-tolerant  services  will  find 
favor  in  some  areas.  It  is  also  to  be  expected  that  new  or  modified  architectures  will 
emerge  to  satisfy  new  markets  and  requirements.  (For  example,  it  is  proposed  that  TTA 
could match FlexRay’s ability to support event-triggered as well as time-triggered communications
by allocating certain time slots to a simulation of CAN; the simulation is
actually faster than a real CAN bus, while retaining all the safety attributes of TTA.) I
hope that the description provided here will help potential users to evaluate existing architectures
against their own needs, and that it will help designers of new architectures
to learn from and build on the design choices made by their predecessors.

Abstract. A complex real-time embedded system may consist of
multiple application components each of which has its own timeliness
requirements and is scheduled by component-specific schedulers.
At run-time, the schedules of the components are integrated
to produce a system-level schedule of jobs to be executed. We formalize
the notions of schedule composition, task group composition
and component composition. Two algorithms for performing composition
are proposed. The first one is an extended Earliest Deadline
First algorithm which can be used as a composability test for schedules.
The second algorithm, the Harmonic Component Composition
algorithm (HCC), provides an online admission test for components.
HCC applies a rate monotonic classification of workloads and is a
hard real-time solution because responsive supply of a shared resource
is guaranteed for in-budget workloads. HCC is also efficient
in terms of composability and requires low computation cost for
both admission control and dispatch of resources.


1  Introduction 

The  integration  of  components  in  complex  real-time  and  embedded  systems  has 
become  an  important  topic  of  study  in  recent  years.  Such  a  system  may  be  made 
up  of  independent  application  (functional)  components  each  of  which  consists 
of a set of tasks with its own specific timeliness requirements. The timeliness
requirements of the task group of a component are guaranteed by a scheduling
policy specific to the component, and thus the scheduler of a complex embedded
system may be composed of multiple schedulers. If these components share some
common resource such as the CPU, then the schedules of the individual components
are interleaved in some way. In extant work, a number of researchers have
proposed algorithms to integrate real-time schedulers such that the timeliness
requirements of all the application task groups can be simultaneously met. The
most relevant work in this area includes work in “open systems” and “hierarchical
schedulers” which we can only briefly review here. Deng and Liu proposed
the open system environment, where application components may be admitted
online and the scheduling of the component schedulers is performed by a kernel
scheduler [2]. Mok and Feng exploited the idea of temporal partitioning [5],
by which individual applications and schedulers work as if each one of them
owns a dedicated “real-time virtual resource”. Regehr and Stankovic investigated
hierarchical schedulers [7]. Fohler addressed the issue of how to dynamically
schedule event-triggered tasks together with an offline-produced schedule
for time-triggered computation [3]. In [9], Wang and Mok combine two popular
schedulers, the cyclic executive and the fixed-priority scheduler, into a hybrid scheduling
system that accommodates a combination of periodic and sporadic tasks.

* This work is supported in part by a grant from the US Office of Naval Research under
grant numbers N00014-99-1-0402 and N00014-98-1-0704, and by a research contract
from SRI International under a grant from the NEST program of DARPA.

All of the works cited above address the issue of schedule/scheduler composition
based on different assumptions. But what exactly are the conditions under
which the composition of two components is correct? Intuitively, the minimum
guarantee is that the composition preserves the timeliness of the tasks in all
the task groups. But in the case where an application scheduler may produce different
schedules depending on the exact time instants at which scheduling decisions
are made, must the composition of components also preserve the exact schedules
that would be produced by the individual application schedulers if they were to
run on dedicated CPUs? Such considerations may be important if an application
programmer relies on the exact sequencing of jobs that is produced by the application
scheduler and not only the semantics of the scheduler to guarantee the
correct functioning of the application component. For example, an application
programmer might manipulate the assignment of priorities such that a fixed priority
scheduler produces a schedule that is the same as that produced by a cyclic
executive for an application task group; this simulation of a cyclic executive by a
fixed priority scheduler may create trouble if the fixed priority scheduler is later
on composed with other schedulers and produces a different schedule which does
not preserve the task ordering in the simulated cyclic executive. Hence, we need
to pay attention to semantic issues in scheduler composition.

In this paper, we propose to formalize the notions of composition on three
levels: schedule composition, task group composition and component composition.
Based on the formalization, we consider the questions of whether two
schedules are composable, and how components may be efficiently composed.
Our formalization takes into account the execution order dependencies (explicit
or implicit) between tasks in the same component. For example, in cyclic executive
schedulers, a deterministic order is imposed on the execution of tasks so
as to satisfy precedence, mutual exclusion and other relations. As is common
practice in handling such dependencies, sophisticated search-based algorithms are
used to produce the deterministic schedules offline, e.g., [8]. To integrate such
components into a complex system, we consider composition with the view that:
First, the correctness of composition should not depend on knowledge about how
the component schedules are produced, i.e., compositionality is fundamentally a
predicate on schedules and not schedulers. Second, the composition of schedules
should be order preserving with respect to its components, i.e., if job x is scheduled
before job y in a component schedule, then job x is still scheduled before
y in the integrated system schedule. Our notion of schedule composition is an
interleaving of component schedules that allows preemptions between jobs from
different components.

The contributions of this paper include: formal definitions of schedule composition,
task group composition and component composition, an optimal schedule
composition algorithm for static schedules and a harmonic component composition
algorithm that has low computation cost and also provides a responsiveness
guarantee. The rest of the paper is organized as follows. Section 2 defines basic
concepts used in the rest of the paper. Section 3 addresses schedule composition.
Section 4 defines and compares task group composition and component composition.
Section 5 defines, illustrates and analyzes the Harmonic Component
Composition  approach.  Section  6  compares  HCC  with  related  works.  Section  7 
concludes  the  paper  by  proposing  future  work. 

2  Definitions 

2.1  Task  Models 

Time  is  defined  on  the  domain  of  non-negative  real  numbers,  and  the  time 
interval  between  time  b  and  time  e  is  denoted  by  (b,  e).  We  shall  also  refer  to  a 
time  interval  (i,i  +  1)  where  i  is  a  non-negative  integer  as  a  time  unit. 

A  resource  is  an  object  to  be  allocated  to  tasks.  It  can  be  a  CPU,  a  bus, 
or  a  packet  switch,  etc.  In  this  paper,  we  shall  consider  the  case  of  a  single 
resource  which  can  be  shared  by  the  tasks  and  components,  and  preemption  is 
allowed.  We  assume  that  context  switching  takes  zero  time;  this  assumption  can 
be  removed  in  practice  by  adding  the  appropriate  overhead  to  the  task  execution 
time. 

A job is defined by a tuple of three attributes (c, r, d) each of which is a
non-negative real number:

—  c  is  the  execution  time  of  a  job,  which  defines  the  amount  of  time  that  must 
be  allocated  to  the  job; 

—  r  is  the  ready  time  or  arrival  time  of  the  job  which  is  the  earliest  time  at 
which  the  job  can  be  scheduled; 

—  d  is  the  deadline  of  the  job  which  is  the  latest  time  by  which  the  job  must 
be  completed. 

A  task  is  an  infinite  sequence  of  jobs.  Each  task  is  identified  by  a  unique  ID 
i.  A  task  is  either  periodic  or  sporadic. 

The set of periodic tasks in a system is represented by $T_p$. A periodic task is
denoted by (i, (c, p, d)), where i identifies the task, and tuple (c, p, d) defines the
attributes of its jobs. The jth job of i is denoted by job (i, j).

Suppose X identifies an object and Y is one of the attributes of the object,
we shall use the notation X.Y to denote the attribute Y of X. For instance, if
(i, j) identifies a job, then (i, j).d denotes the deadline of job (i, j).

The attributes in the definition of a periodic task, c, p and d, are non-negative
real numbers:

—  c  is  the  execution  time  of  a  task,  which  defines  the  amount  of  time  that  must 
be  allocated  to  each  job  of  the  task; 

—  p  is  the  period  of  the  task; 

— d is the relative deadline of the task, which is the maximal length of time by
which a job must be completed after its arrival. We assume that for every
periodic task, $c \le d \le p$.

If a periodic task i is defined by (c, p, d), job (i, j) is defined by $(c,\ j \cdot p,\ j \cdot p + d)$.
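
As a concrete reading of these definitions, the following minimal sketch represents jobs and periodic tasks as Python dataclasses. The class and method names are assumptions introduced for illustration, and the job index j is taken to start at 0 so that the first job is ready at time 0.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    c: float  # execution time
    r: float  # ready (arrival) time
    d: float  # absolute deadline


@dataclass(frozen=True)
class PeriodicTask:
    i: int    # task ID
    c: float  # execution time of each job
    p: float  # period
    d: float  # relative deadline

    def __post_init__(self):
        # Non-strict comparison, which admits tasks such as (1, 5, 5)
        # whose relative deadline equals the period.
        if not (0 <= self.c <= self.d <= self.p):
            raise ValueError("expected 0 <= c <= d <= p")

    def job(self, j: int) -> Job:
        """Return job (i, j), i.e. (c, j*p, j*p + d)."""
        return Job(self.c, j * self.p, j * self.p + self.d)


task = PeriodicTask(i=0, c=1, p=5, d=5)
print(task.job(0), task.job(2))
```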

A sporadic task is denoted by a tuple (i, (c, p, d)), where i identifies the task,
and (c, p, d) defines the attributes of its jobs, as follows: The jth job of sporadic
task i is identified as job (i, j), $j \ge 0$. The arrival times of jobs of a sporadic task
are not known a priori and are determined at run time by an arrival function A
that maps each job of a sporadic task to its arrival time for the particular run:

$A :: T_s \times N \to R$, where N is the set of natural numbers and R is the set of
real numbers.

$A(i, j) = t$ if the job (i, j) arrives at time t.

$A(i, j) = \bot$ if the job (i, j) never arrives.

The attributes c and d of a sporadic task are defined the same as those of
a periodic task. However, attribute p of a sporadic task represents the minimal
interval between the arrival times of any two consecutive jobs. In terms of the
function A, $A(i, j + 1) - A(i, j) \ge p$ if $A(i, j + 1)$ is defined.

For a sporadic task (i, (c, p, d)), job (i, j) is defined as (c, A(i, j), A(i, j) + d).

A task group TG consists of a set of tasks (either periodic or sporadic). We
shall  use  STG  to  denote  a  set  of  task  groups.  The  term  component  denotes  a 
task  group  and  its  scheduler.  Sometimes  we  call  a  task  group  an  application 
task  group  to  emphasize  its  association  with  a  component  which  is  one  of  many 
applications  in  the  system. 
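
The role of the arrival function for sporadic tasks can be sketched as follows. The helper names are hypothetical, a Python list stands in for A(i, ·) on one run, and `None` plays the role of an undefined arrival; since p is the minimal interval between arrivals, a gap of exactly p is accepted.

```python
from typing import Optional, Sequence, Tuple


def respects_min_separation(arrivals: Sequence[Optional[float]], p: float) -> bool:
    """arrivals[j] stands for A(i, j); None means job (i, j) never arrives."""
    for j in range(len(arrivals) - 1):
        this, nxt = arrivals[j], arrivals[j + 1]
        if nxt is None:
            continue      # A(i, j+1) is undefined: nothing to check
        if this is None or nxt - this < p:
            return False  # assumption: no job arrives after a missing one
    return True


def sporadic_job(c: float, d: float, arrival: float) -> Tuple[float, float, float]:
    """Attributes (c, r, d) of a sporadic job that arrives at `arrival`."""
    return (c, arrival, arrival + d)


print(respects_min_separation([0, 5, 12, None], p=5))
print(respects_min_separation([0, 4], p=5))
print(sporadic_job(c=1, d=5, arrival=7.5))
```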


2.2  Schedule 

A resource supply function Sup defines the maximal time that can be supplied to
a component from time 0 to time t. The supply function must be monotonically
non-decreasing. In other words, if $t < t'$, then $Sup(t) \le Sup(t')$.

The function S maps each job to a set of time intervals:

$S :: TG \times N \to \{(R, R)\}$, where TG is a task group, and N and R represent
the set of natural numbers and the set of real numbers respectively.

$S(i, j) = \{(b_{i,j,k}, e_{i,j,k}) \mid 0 \le k < h\}$, where k and h are natural numbers.

S is a schedule of TG under supply function Sup if and only if all of the
following conditions are satisfied:

— Constraint 1: For every job (i, j), every time interval assigned to it in the
schedule must be assigned in a time interval allowed by the supply function,
i.e., for all $(b, e) \in S(i, j)$, $Sup(e) - Sup(b) = e - b$.

— Constraint 2: The resource is allocated to at most one job at a time, i.e.,
time intervals do not overlap: For every $(b_{i,j,k}, e_{i,j,k}) \in S(i, j)$ and for every
$(b_{i',j',k'}, e_{i',j',k'}) \in S(i', j')$, one of the following cases must be true:

• $b_{i,j,k} \ge e_{i',j',k'}$, or

• $b_{i',j',k'} \ge e_{i,j,k}$, or

• $i = i'$, $j = j'$ and $k = k'$.

— Constraint 3: A job must be scheduled between its ready time and deadline:
for every $(b, e) \in S(i, j)$,

$(i, j).r \le b < e \le (i, j).d$

— Constraint 4: For every job (i, j), the total length of all time intervals in
S(i, j) is sufficient for executing the job, i.e.,

$\sum_{(b,e) \in S(i,j)} (e - b) \ge (i, j).c$
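
The four constraints can be checked mechanically for a finite set of jobs. The sketch below is one possible reading, not the authors' code: the function name `is_schedule`, the dictionary encoding of S and of the job attributes, and the tolerance `eps` are all assumptions made for illustration.

```python
def is_schedule(S, jobs, sup=None, eps=1e-9):
    """Check Constraints 1-4 for a finite set of jobs.

    S:    dict mapping a job id (i, j) to its list of intervals (b, e)
    jobs: dict mapping a job id (i, j) to its attributes (c, r, d)
    sup:  optional supply function Sup(t); Constraint 1 is skipped if None
    """
    all_intervals = []
    for job_id, (c, r, d) in jobs.items():
        total = 0.0
        for b, e in S.get(job_id, []):
            # Constraint 1: the supply function grants the whole interval.
            if sup is not None and abs((sup(e) - sup(b)) - (e - b)) > eps:
                return False
            # Constraint 3: the interval lies between ready time and deadline.
            if not (r - eps <= b < e <= d + eps):
                return False
            total += e - b
            all_intervals.append((b, e))
        # Constraint 4: enough time is allocated to execute the job.
        if total < c - eps:
            return False
    # Constraint 2: after sorting by start time, neighbours must not overlap.
    all_intervals.sort()
    for (_, e1), (b2, _) in zip(all_intervals, all_intervals[1:]):
        if b2 < e1 - eps:
            return False
    return True


jobs = {(0, 0): (1, 0, 5), (1, 0): (2, 0, 5)}
good = {(0, 0): [(0, 1)], (1, 0): [(1, 2), (3, 4)]}
overlapping = {(0, 0): [(0, 1)], (1, 0): [(0.5, 2.5)]}
print(is_schedule(good, jobs), is_schedule(overlapping, jobs))
```

Checking only adjacent intervals after sorting is sufficient, because any overlapping pair implies an overlap between some pair of neighbours in start-time order.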

Given  a  time  t,  if  there  exists  a  time  interval  (b,  e)  in  S(i,j)  such  that  b  < 
t  <  e,  then  job  (i,j)  is  scheduled  at  time  t,  and  task  i  is  scheduled  at  time  t. 

An algorithm Sch is a scheduler if and only if it produces a schedule S for
TG under A and Sup.

A component C of a system is defined by a tuple (TG, Sch) which specifies the
task  group  to  be  scheduled  and  the  task  group’s  scheduler.  A  set  of  components 
will  be  written  as  SC. 

3  Schedule  Composition 

Suppose $S_h$ is a schedule of a component task group $TG_h$. We say that the
schedule S integrating the component schedules in $\bigcup_h TG_h$ is a composed schedule
of all component schedules $\{S_h \mid 0 \le h \le n - 1\}$ if and only if there exists a
function M which maps each scheduled time interval in $S_h$ to a time window
subject to the following conditions:

— For each time interval $(b, e) \in S_h(i, j)$, $M(h, (b, e)) = (b_h, e_h)$, and $(b_h, e_h)$ is
within the ready time and deadline of job (i, j);

— The time scheduled to job (i, j) by S between $(b_h, e_h)$ is equal to $e - b$:

$\sum_{(x,y) \in S(i,j),\ b_h \le x < y \le e_h} (y - x) = e - b$

— $M(h, (b, e))$ is before $M(h, (b', e'))$ if and only if $(b, e) \in S_h(i, j)$ is before
$(b', e') \in S_h(i', j')$.

The notion of schedule composition is illustrated in Figure 1 where the component
schedule $S_0$ is interleaved with other component schedules into a composed
schedule S. Notice that the time intervals occupied by $S_0$ can be mapped
into S without changing the order of these time intervals.

[Figure: timelines for schedule $S_0$, mapping function M, and system schedule S; legend: occupied by other schedules, occupied by $S_0$, idle]

Fig. 1. Definition of Schedule Composition


To test whether a set of schedules can be integrated into a composed schedule,
we now propose an extended Earliest Deadline First algorithm for schedule
composition. From the definition of a schedule, the execution of a job (i, j) can be
scheduled into a set of time intervals by a schedule S. We use the term S(i, j) to
denote the set of time intervals job (i, j) occupies. In the following, we shall refer
to a time interval in S(i, j) as a job fragment of the job (i, j). The schedule composition
algorithm works as follows. A job fragment is created corresponding to
the first time interval of the first job in each component schedule that has not
been integrated into S, and the job fragments from all schedules are scheduled
together by EDF. After the job fragment, say for schedule $S_h$, has completed,
the job fragment is deleted and another job fragment is created corresponding
to the next time interval in schedule $S_h$.

The  schedule  composition  algorithm  is  defined  below. 

— Initially, all job fragments from all component schedules are unmarked.

— At any time t, Ready is a set that contains all the job fragments from all
the component schedules that are ready to be composed. Initially, Ready is
empty.

— At any time t, if there is no job fragment from component schedule $S_h$ in
Ready, construct one denoted as (h, c, r, d) by the following steps:

• Let (b, e) be an unmarked time interval such that $(b, e) \in S_h(i, j)$ and
for all unmarked time intervals $(b', e') \in S_h(i', j')$, $b \le b'$;

• Define the execution time of the job fragment as the length of the scheduled
time interval: $c := e - b$;

• Define the ready time of the job fragment as the ready time of the job scheduled
at (b, e): $r := (i, j).r$;

• Define the deadline of the job fragment as the earliest deadline among all
jobs scheduled after time b by $S_h$:

$d := \min(\{(i', j').d \mid (b', e') \in S_h(i', j') \text{ and } b \le b'\})$

• Mark interval (b, e).

— Allocate the resource to the job fragment in Ready that is ready and has
the earliest deadline.

— If the accumulated time allocated to a job fragment is equal to the execution
time of the job fragment, delete the job fragment from Ready.

— If t is equal to the deadline of a job fragment in Ready that has not yet
completed, the schedule composition fails.

In the above, the time intervals within a component schedule $S_h$ are transformed
into job fragments and put into Ready one by one in their original order
in $S_h$. At any time t, just one job fragment from $S_h$ is in Ready. Therefore, the
order of time intervals in a component schedule is preserved in the composed
schedule.
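
The algorithm can be sketched in a few lines under simplifying assumptions that are not part of the paper: time advances in integer units, every interval has positive length, and each component schedule is supplied as a list of (b, e, r, d) tuples giving an interval together with the ready time and deadline of the job it belongs to. The function name `compose` and this encoding are hypothetical.

```python
def compose(schedules, horizon):
    """Compose component schedules by extended EDF over integer time units.

    schedules: one list per component, sorted by start time, of tuples
        (b, e, r, d): interval (b, e) is given to a job with ready time r
        and absolute deadline d in that component's own schedule.
    Returns a list whose t-th entry is the component served in time unit
    (t, t + 1), or None when idle; returns None if composition fails.
    """
    # Turn every interval into a job fragment [c, r, d], where d is the
    # earliest deadline among jobs scheduled at or after this interval.
    fragments = []
    for sched in schedules:
        frags, d_min = [], float("inf")
        for b, e, r, d in reversed(sched):
            d_min = min(d_min, d)
            frags.append([e - b, r, d_min])
        fragments.append(frags[::-1])

    current = [0] * len(schedules)                        # fragment index per component
    remaining = [f[0][0] if f else 0 for f in fragments]  # time left in that fragment
    timeline = []
    for t in range(horizon):
        best = None
        for h, frags in enumerate(fragments):
            if current[h] == len(frags):
                continue                  # component h is fully composed
            _, r, d = frags[current[h]]
            if d <= t:
                return None               # deadline reached before completion
            if r <= t and (best is None or d < fragments[best][current[best]][2]):
                best = h                  # ready fragment with earliest deadline
        timeline.append(best)
        if best is not None:
            remaining[best] -= 1
            if remaining[best] == 0:      # fragment done: load the next one
                current[best] += 1
                if current[best] < len(fragments[best]):
                    remaining[best] = fragments[best][current[best]][0]
    done = all(current[h] == len(f) for h, f in enumerate(fragments))
    return timeline if done else None


s0 = [(0, 1, 0, 5), (5, 6, 5, 10)]   # one unit of work every 5 time units
s1 = [(0, 2, 0, 5), (5, 7, 5, 10)]   # two units of work every 5 time units
print(compose([s0, s1], horizon=10))
```

Keeping a single current fragment per component mirrors the rule that only one job fragment from each component schedule is in Ready at a time, which is what preserves the order of intervals within each component.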

The  extended  EDF  is  optimal  in  terms  of  composability.  In  other  words,  if 
a  composed  schedule  exists  for  a  given  set  of  component  schedules,  then  the 
extended  EDF  produces  one. 

Theorem  1  The  extended  EDF  is  an  optimal  schedule  composition  algorithm. 

Proof: If the extended EDF for composition fails at time f, then let s be
the latest time at which the following conditions are all true: for any $S_h$, there exists
$(b, e) \in S_h(i, j)$ with $(i, j).r \ge s$, all time intervals before b in $S_h$ are composed
into S no later than time s, and for all $(b', e')$ composed between s and f, the
corresponding job fragment has deadline no later than f. Then for any time t
between (s, f), there is a $(b', e') \in S(i', j')$ with $b' < t < e'$. The aggregate length
of time intervals from component schedules that must be integrated between
(s, f) is larger than $f - s$, therefore no schedule composition exists. ■

Because of its optimality, the extended EDF is a composability test for any
set of schedules. Although the extended EDF is optimal, the approach has a
limitation: the input component schedules must be static. In other words, to
generate the system schedule at time $t$, the component schedules after time
$t$ need to be known. Otherwise, the deadline of the pseudo job in Ready cannot
be decided optimally. Therefore, the extended EDF schedule composition approach
cannot be applied optimally to dynamically produced schedules.

4  Task  Group  Composability  and  Component 
Composability 

We say that a set of task groups $STG = \{TG_0, .., TG_{n-1}\}$ is weakly
composable if and only if the following holds: Given any set of arrival
functions $\{A_0, .., A_{n-1}\}$ for the task groups in $STG$, for any
$0 \le k \le n - 1$, there exists a schedule $S_k$ for $TG_k$ under $A_k$, and
$SS = \{S_0, .., S_{n-1}\}$ is composable. Obviously, weak composability is
equivalent to the schedulability of the task group $\bigcup_{STG} TG_k$.
We say that a set of task groups $STG$ is strongly composable if and only if the
following holds: Given any schedule $S_k$ of $TG_k$ under any $A_k$,
$SS = \{S_0, .., S_{n-1}\}$ is composable. The following is a simple example of
strong composability.

Suppose there are two task groups. $TG_0$ consists of a periodic task
$T_0 = (1, 5, 5)$, and $TG_1$ consists of a sporadic task $T_1 = (1, 5, 5)$.
Then an arbitrary schedule $S_0$ for $TG_0$ and an arbitrary schedule $S_1$ of
$TG_1$ can always be composed into a schedule $S$ by the extended EDF no matter
what the arrival function is. Therefore, this set of task groups is strongly
composable.

Not all weakly composable sets of task groups are strongly composable. Suppose
we change the above example of a strongly composable set of task groups by
adding another periodic task $T_2 = (4, 10, 10)$ to task group $TG_0$. Two
schedules can be produced for $TG_0$ by fixed priority schedulers: $S_0$ and
$S'_0$. In $S_0$, suppose we give a higher priority to $T_0$, and therefore for
all $j$, $S_0(0, j) = (5j, 5j + 1)$ and $S_0(2, j) = (10j + 1, 10j + 5)$. For
$S'_0$, suppose we give a higher priority to $T_2$, and therefore for any
number $j$, $S'_0(0, 2j) = (10j + 4, 10j + 5)$,
$S'_0(0, 2j + 1) = (10j + 5, 10j + 6)$, and $S'_0(2, j) = (10j, 10j + 4)$.

$S_0$ is composable with any schedule $S_1$ of $TG_1$, but $S'_0$ is not. In
$S'_0$, for any $j$, the deadline of job $(0, 2j)$ is at $10j + 5$, and yet it
is scheduled after job $(2, j)$ whose deadline is at $10j + 10$. Because of the
order-preserving property of schedule composition, it follows that every time
interval $(10j, 10j + 5)$ must be assigned to $S'_0$. Thus, if a job of $T_1$
arrives at time $10j$, schedule composition becomes impossible.
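The order-preserving composition of the previous section and the example above can be checked with a small sketch. The Python code below is only an illustration under stated assumptions: time is discrete, the horizon is finite, each component schedule is a list of hypothetical `(b, e, ready, deadline)` tuples, and the names `compose`, `S0`, `S0_prime` and `S1` are not from the original. It covers one period of the example with a job of T1 arriving at time 0.

```python
def compose(schedules, horizon):
    """Order-preserving EDF composition over integer time slots (sketch).

    schedules: one list per component of (b, e, ready, deadline) tuples,
    sorted by start time b. Returns the owner of each slot, or None if
    the composition fails.
    """
    # Turn every interval into a job fragment (length, ready, deadline).
    # The fragment deadline is the earliest deadline of this interval and
    # of all later intervals in the same component schedule.
    fragments = []
    for sched in schedules:
        frags = []
        for idx, (b, e, ready, _) in enumerate(sched):
            deadline = min(d for (_, _, _, d) in sched[idx:])
            frags.append((e - b, ready, deadline))
        fragments.append(frags)

    head = [0] * len(schedules)  # fragment of each component now in Ready
    used = [0] * len(schedules)  # time already allocated to that fragment
    owners = []
    for t in range(horizon):
        candidates = []
        for h, frags in enumerate(fragments):
            if head[h] == len(frags):
                continue
            _, ready, deadline = frags[head[h]]
            if deadline <= t:
                return None  # unfinished fragment reached its deadline
            if ready <= t:
                candidates.append((deadline, h))
        if not candidates:
            owners.append(None)
            continue
        _, h = min(candidates)  # earliest deadline first
        owners.append(h)
        used[h] += 1
        if used[h] == fragments[h][head[h]][0]:
            head[h] += 1
            used[h] = 0
    if any(head[h] < len(frags) for h, frags in enumerate(fragments)):
        return None
    return owners


# One period of the example: T0 = (1, 5, 5), T2 = (4, 10, 10), T1 = (1, 5, 5).
S0 = [(0, 1, 0, 5), (1, 5, 0, 10), (5, 6, 5, 10)]        # T0 has priority
S0_prime = [(0, 4, 0, 10), (4, 5, 0, 5), (5, 6, 5, 10)]  # T2 has priority
S1 = [(0, 1, 0, 5)]                                      # T1 job arriving at 0

print(compose([S0, S1], 10))
print(compose([S0_prime, S1], 10))
```

Under these assumptions the first call yields a composed schedule, while the second returns `None`: the fragments of `S0_prime` need all five slots before time 5, which leaves no room for the job of T1.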

We say that a set of supply functions $SSup = \{Sup_0, .., Sup_{n-1}\}$ is
consistent if and only if the aggregate time supply of all functions over any
time interval $(b, e)$ is less than or equal to the length of the interval:

$\sum_{k=0}^{n-1} (Sup_k(e) - Sup_k(b)) \le e - b$
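The consistency condition can be tested directly on small cases. The sketch below assumes that each supply function is cumulative (it returns the time supplied up to an integer instant) and that checking integer endpoints up to a finite horizon is enough; the function name `is_consistent` and the sample supply functions are hypothetical.

```python
def is_consistent(supplies, horizon):
    """Check sum_k (Sup_k(e) - Sup_k(b)) <= e - b for all 0 <= b <= e <= horizon."""
    for b in range(horizon + 1):
        for e in range(b, horizon + 1):
            total = sum(sup(e) - sup(b) for sup in supplies)
            if total > e - b:
                return False
    return True


def even_slots(t):
    """Cumulative supply when a component receives slots 0, 2, 4, ..."""
    return (t + 1) // 2


def odd_slots(t):
    """Cumulative supply when a component receives slots 1, 3, 5, ..."""
    return t // 2


def every_slot(t):
    """Cumulative supply when a component receives every slot."""
    return t


print(is_consistent([even_slots, odd_slots], 20))   # alternating slots
print(is_consistent([even_slots, every_slot], 20))  # over-committed resource
```

The first pair shares the resource without overlap and is consistent; the second pair promises more than one unit of supply per unit of time and is not.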

Suppose $SC = \{(Sch_0, TG_0), .., (Sch_{n-1}, TG_{n-1})\}$ is a set of
components. $SC$ is composable if and only if given any set of arrival functions
$SA = \{A_0, .., A_{n-1}\}$, there exists a set of consistent supply functions
$SSup = \{Sup_0, .., Sup_{n-1}\}$ such that $Sch_k$ produces schedule $S_k$ of
$TG_k$ under arrival function $A_k$ and supply function $Sup_k$, and
$SS = \{S_0, .., S_{n-1}\}$ is composable.

Component  composability  lies  between  weak  composability  and  strong  com- 
posability  of  task  groups  in  the  following  sense.  A  component  has  its  own  sched¬ 
uler  which  may  produce  for  a  given  arrival  function,  a  schedule  among  a  number 
of  valid  schedules  under  the  arrival  function.  Therefore,  given  a  set  of  compo¬ 
nents,  if  the  corresponding  set  of  task  groups  of  these  components  are  strongly 
composable,  then  the  components  are  composable;  if  the  task  groups  are  not 
even  weakly  composable,  the  components  are  not  composable.  However,  when 
the  task  groups  are  weakly  but  not  strongly  composable,  component  compos¬ 
ability  depends  on  the  specifics  of  component  schedulers. 

To illustrate these concepts, we compare weak task group composability,
strong task group composability and component composability in the following
example, which is depicted in Figure 2. Suppose there are two components
$C_0 = (TG_0, Sch_0)$ and $C_1 = (TG_1, Sch_1)$. For any valid arrival function
$A$ for each of the task groups, there exists in general a set of schedules that
may correspond to the execution of the task group under the arrival function.
In Figure 2, the circle marked as $SS_{0,0}$ represents the set of all possible
schedules of $TG_0$ under $A_0$; $SS_{0,1}$, $SS_{1,0}$ and $SS_{1,1}$ are
defined similarly. If $TG_0$ and $TG_1$ are strongly composable, then for a
randomly picked schedule $S_0$ from $SS_{0,x}$ and a randomly picked schedule
$S_1$ from $SS_{1,y}$, where $x$ and $y$ are variables, $S_0$ and $S_1$ are
composable. If $TG_0$ and $TG_1$ are weakly composable, then for any $x$ and
$y$, there exists a schedule $S_0$ from $SS_{0,x}$ and there exists a schedule
$S_1$ from $SS_{1,y}$ such that $S_0$ and $S_1$ are composable. The small circle
marked as $SS_{0,0,s}$ is the set of all schedules that can be produced by the
scheduler $Sch_0$ under $A_0$. Each point in $SS_{0,0,s}$ corresponds to one
schedule, and to one or multiple supply functions upon which $Sch_0$ produces
that schedule. Circles $SS_{0,1,s}$, $SS_{1,0,s}$ and $SS_{1,1,s}$ are defined
similarly. If components $C_0$ and $C_1$ are composable, then for any pair of
$x$ and $y$, there exist a schedule $S_0$ in $SS_{0,x,s}$ and a schedule $S_1$
in $SS_{1,y,s}$ such that $S_0$ and $S_1$ are composable, and there exist a
supply function $Sup_0$ corresponding to $S_0$ and a supply function $Sup_1$
corresponding to $S_1$ such that $Sup_0$ and $Sup_1$ are consistent.


Fig.  2.  Composability 


In many scheduler composition paradigms, the resource supply functions can
be determined only online for components that have unpredictable arrivals of
jobs. Therefore it is often hard to define resource supply functions a priori.
However, we can introduce the notion of contracts to express the requirements
imposed on the supply function by a component, as the interface between a
component and the composition coordinator. In the next section, we shall discuss
Harmonic Component Composition, which makes use of explicit supply function
contracts.

5  Harmonic  Component  Composition 

We consider the tradeoff between composability and the simplicity in the design
of the system-level scheduler to be a significant challenge in component
composition. As an extreme case in pursuing simplicity, a coordinator may
allocate resources among components based on a few coarse-grain parameters of
each component, such as the worst case response time and bandwidth requirement.
This type of solution often does not achieve composability, i.e., admission of
new components may be disallowed even when the aggregate resource utilization
is low because of previous overly conservative capacity commitments. At the
opposite extreme, the coordinator may depend on details about the components
to perform complex analysis and may take on too many obligations from individual
components, such that the system performance may eventually be degraded.
We now propose a solution to meet the challenge by introducing class-based
workloads. We call this approach Harmonic Component Composition (HCC).


5.1  Coordinator  Algorithm 

The system designer will select a constant $K$ as the number of resource
classes. A class $k$ ($k \in [0, K)$) is defined by a class period
$P_k = m^k$, where $m$ is a designer-selected constant. We require a rate
monotonic relation between the periods of classes: For any
$0 \le l < k \le K - 1$, $P_l / P_k = m^{l-k}$. A lower class has a larger class
number and a longer class period.

When a component $C$ is ready to run, it generates a supply contract and sends
it to the coordinator. The supply contract is a list of workloads, each defined
as $(k, l, w)$, where $k \le l$. The workload permits up to $w$ time units of
resource supply to be demanded within any time interval of length $m^l$; once a
demand occurs, it must be met within $m^k$ time units. Upon receiving a supply
contract, the coordinator will admit a component if and only if it can satisfy
the contract without compromising the contracts with previously admitted
components.

When a demand is proposed to class $k$, it will be served within $m^k$ time
units. To keep this guarantee, HCC maintains a straightforward invariant: the
supply needed online for class $k$ or higher in any time interval of length
$m^k$ is less than or equal to $m^k$. To accomplish this, the aggregate workload
admitted to class $k$ or higher is constrained as if there were a conceptual
resource associated with class $k$, which is consumed by admitting any workload
of class $k$ or higher. Suppose that $R_k$ represents the conceptual resource of
class $k$. $R_k$ is initialized as $P_k$. A workload $(k, l, w)$ requires no
conceptual resource from the classes higher than $k$, but requires it from every
class lower than or equal to $k$. The value of the conceptual resource
requirement of a workload $(k, l, w)$ on class $i$ is derived from the worst
case occupation in a time interval of length $P_i$ by the workload.

If a component $C_h$ is admitted, the coordinator establishes a server
identified with $(h, k, l)$ for each workload $(k, l, w)$ in the contract. The
component to which the server belongs is identified by $h$, the class of the
server is $k$, and $(k, l)$ defines a subclass. All servers of class $i$ are in
a list $L_i$. The server is defined with a budget limit $w$ and a replenishment
period of $m^l$. A server has four registers: load, carry, budget and
replenish.

Initialization:

(1) foreach 0 ≤ k ≤ K - 1
(2)     R_k := P_k
(3)     L_k is set as an empty list

Contract Admission:

(1) Upon component C_h proposing a contract V_h, which is a list of (k, l, w)
(2) foreach 0 ≤ i ≤ K - 1
(3)     R'_i := R_i
(4) foreach (k, l, w) ∈ V_h
(5)     foreach k ≤ i ≤ l
(6)         R'_i := R'_i - w
(7)     foreach l + 1 ≤ i ≤ K - 1
(8)         R'_i := R'_i - w · m^(i-l)
(9) if ∃ R'_i < 0
(10)    reject component C_h and terminate this run of contract admission;
(11) foreach i ∈ [0, K - 1]
(12)    R_i := R'_i
(13) foreach (k, l, w) ∈ V_h
(14)    construct server (h, k, l) and add to the end of L_k, with the
        following initial values:
(15)    budget = w, load = carry = 0, replenish as empty queue.
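The admission test above can be sketched in a few lines of Python. This is an illustrative sketch rather than the authors' implementation: the names `initial_resources` and `try_admit` are hypothetical, and it assumes seven classes (0 through 6) with m = 2, since the contracts of the example in Section 5.3 refer to class 6.

```python
def initial_resources(m, num_classes):
    """Conceptual resource R_k starts at the class period P_k = m**k."""
    return [m ** k for k in range(num_classes)]


def try_admit(resources, contract, m):
    """Return the updated resource list if the contract fits, else None."""
    trial = list(resources)                 # R'_i := R_i
    for (k, l, w) in contract:
        for i in range(k, l + 1):           # classes k..l pay w
            trial[i] -= w
        for i in range(l + 1, len(trial)):  # lower classes pay w * m^(i-l)
            trial[i] -= w * m ** (i - l)
    if any(r < 0 for r in trial):
        return None                         # reject, resources unchanged
    return trial


M, NUM_CLASSES = 2, 7  # assumption: classes 0..6, as used in Section 5.3
contracts = [
    [(0, 6, 1), (3, 6, 2)],  # V_0
    [(1, 1, 1), (3, 3, 1)],  # V_1
    [(6, 6, 16)],            # V_2
    [(4, 4, 3)],             # V_3
]

resources = initial_resources(M, NUM_CLASSES)
admitted = []
for h, contract in enumerate(contracts):
    updated = try_admit(resources, contract, M)
    if updated is not None:
        resources = updated
        admitted.append(h)

print(admitted, resources)
```

With these inputs the sketch admits components 0, 1 and 2 and rejects component 3, and the remaining resources agree with the values reported in Table 1.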


Referring to the algorithm specification above, a component $C_h$ may load a
server $(h, k, l)$ by adding a value to its register load when the component
$C_h$ demands usage of the resource. If the value of the load register is
positive, the server is loaded. If a loaded server has budget (budget > 0), then
the budget is consumed on the load and all or part of the loaded value becomes
carried (carry > 0). At the start of a time unit $(t, t + 1)$ (which means $t$
is a non-negative integer), if class $k$ is the highest class with a carried
server, then the first carried server in $L_k$ supplies resource in the time
unit $(t, t + 1)$.

The existing budget of a server is held in budget. When load and budget are
both positive and v = min(load, budget), both of them are reduced by v and
carry is increased by v. Consumed budget will be replenished after $m^l$ units
of time. The queue replenish records the scheduled replenishments in the future.

Online Execution:

(1) Upon the start of time unit (t, t + 1):
(2)     foreach server (h, k, l)
(3)         Replenish budget:
(4)         if the head of queue in replenish is (val, t)
(5)             budget := budget + val
(6)             dequeue (val, t) from replenish
(7)         Carry work load:
(8)         if load > 0 and budget > 0
(9)             v := min(load, budget)
(10)            carry := carry + v
(11)            budget := budget - v
(12)            load := load - v
(13)            enqueue (v, t + m^l) to replenish
(14)    Supply Resource:
(15)    Select server (h, k, l), such that k is the highest class with at
        least one carried server, and (h, k, l) is the first carried server
        in L_k.
(16)    carry := carry - 1
(17)    Supply resource to component C_h in time unit (t, t + 1)
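The online execution of the coordinator can be simulated with a short sketch. The Python code below is a hedged illustration, not the authors' implementation: the `Server` class, `coordinator_step` and the `loads` dictionary are hypothetical names, replenishments are stored as (value, due time) pairs, and the servers and loads are those of the example in Section 5.3, with the time-0 loads taken to be the values that appear as carried in Table 4.

```python
from collections import deque


class Server:
    """One server (h, k, l) with budget limit w (illustrative sketch)."""

    def __init__(self, h, k, l, w):
        self.h, self.k, self.l = h, k, l
        self.budget, self.load, self.carry = w, 0, 0
        self.replenish = deque()  # entries are (value, due time)

    @property
    def name(self):
        return (self.h, self.k, self.l)


def coordinator_step(servers, t, m):
    """Run the coordinator at the start of (t, t + 1); return the supplier."""
    for s in servers:
        # Replenish budget that falls due at time t.
        if s.replenish and s.replenish[0][1] == t:
            s.budget += s.replenish.popleft()[0]
        # Move in-budget load to carry and schedule its replenishment.
        if s.load > 0 and s.budget > 0:
            v = min(s.load, s.budget)
            s.carry += v
            s.budget -= v
            s.load -= v
            s.replenish.append((v, t + m ** s.l))
    carried = [s for s in servers if s.carry > 0]
    if not carried:
        return None
    # Highest class = smallest class number; min() keeps admission order
    # among servers of the same class.
    supplier = min(carried, key=lambda s: s.k)
    supplier.carry -= 1
    return supplier.name


# Servers of the admitted components, in admission order.
servers = [Server(0, 0, 6, 1), Server(0, 3, 6, 2), Server(1, 1, 1, 1),
           Server(1, 3, 3, 1), Server(2, 6, 6, 16)]
by_name = {s.name: s for s in servers}

# Loads added by the components at the start of each time unit (assumed).
loads = {
    0: {(0, 3, 6): 2, (1, 1, 1): 1, (1, 3, 3): 1, (2, 6, 6): 16},
    3: {(1, 1, 1): 1},
    4: {(0, 0, 6): 1},
}

suppliers = []
for t in range(5):
    for name, amount in loads.get(t, {}).items():
        by_name[name].load += amount
    suppliers.append(coordinator_step(servers, t, 2))

print(suppliers)
```

Under these assumptions the suppliers for time units 0 to 4 are (1, 1, 1), (0, 3, 6), (0, 3, 6), (1, 1, 1) and (0, 0, 6), which matches the step-by-step walk-through of Section 5.3.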


When a component terminates, the coordinator reclaims the conceptual
resources from the component.

Component termination:

(1) Upon the termination of component C_h
(2)     foreach (k, l, w) ∈ V_h
(3)         delete server (h, k, l) from L_k
(4)         foreach k ≤ i ≤ l
(5)             R_i := R_i + w
(6)         foreach l + 1 ≤ i ≤ K - 1
(7)             R_i := R_i + w · m^(i-l)
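Reclaiming is the mirror image of admission, which the following sketch illustrates. The function name `reclaim` is hypothetical, m = 2 and seven classes are assumed, and the starting resource values are those left after admitting the first three components of the example in Section 5.3.

```python
def reclaim(resources, contract, m):
    """Give back the conceptual resources held by a terminating component."""
    restored = list(resources)
    for (k, l, w) in contract:
        for i in range(k, l + 1):              # classes k..l get w back
            restored[i] += w
        for i in range(l + 1, len(restored)):  # lower classes get w * m^(i-l)
            restored[i] += w * m ** (i - l)
    return restored


after_admission = [0, 0, 1, 0, 3, 9, 5]  # R_0..R_6 with C_0, C_1, C_2 admitted
contract_1 = [(1, 1, 1), (3, 3, 1)]      # contract of component C_1

after_termination = reclaim(after_admission, contract_1, 2)
print(after_termination)
```

The result equals the resources that remain when only the contracts of components 0 and 2 are charged, so a later admission request sees the capacity freed by component 1.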


5.2  Component  Algorithm 

In  the  HCC  approach,  a  component  generates  a  supply  contract,  and  if  admitted, 
it  may  demand  supply  from  its  servers.  Different  algorithms  may  be  applied  for 
different  components  in  a  composition.  We  describe  one  solution  here  as  an 
example. 

Assume that there is a component $C_h$, and its component scheduler is EDF.
A task $(c, p, d)$ is categorized to subclass
$(\lfloor \log_m d \rfloor, \lfloor \log_m p \rfloor)$, and its execution
time is added to the weight $w$ of the workload with that subclass.
Supply Contract Generation:

(1) foreach (k, l) such that 0 ≤ k ≤ l ≤ K - 1
(2)     w_{k,l} := 0
(3) foreach i ∈ T_h
(4)     k := ⌊log_m i.d⌋
(5)     l := ⌊log_m i.p⌋
(6)     w_{k,l} := w_{k,l} + i.c
(7) foreach w_{k,l} ≠ 0
(8)     add workload (k, l, w_{k,l}) into contract V_h
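The contract generation steps can be sketched as follows. This Python code is illustrative only: `class_index` and `generate_contract` are hypothetical names, the floor of the logarithm is computed with integer arithmetic to avoid rounding problems, and capping the class index at the lowest class is an assumption made here so that a task with an infinite period (such as the sporadic task in Section 5.3) maps to a valid class.

```python
import math
from collections import defaultdict


def class_index(value, m, max_class):
    """floor(log_m(value)), computed with integers and capped at max_class."""
    k = 0
    while k < max_class and m ** (k + 1) <= value:
        k += 1
    return k


def generate_contract(tasks, m, max_class):
    """Group tasks (c, p, d) by subclass (k, l) and sum their execution times."""
    weights = defaultdict(int)
    for (c, p, d) in tasks:
        k = class_index(d, m, max_class)  # class from the relative deadline
        l = class_index(p, m, max_class)  # replenishment class from the period
        weights[(k, l)] += c
    return [(k, l, w) for (k, l), w in sorted(weights.items()) if w != 0]


# Task sets of the four components in the example of Section 5.3.
components = [
    [(1, math.inf, 1), (1, 80, 8), (1, 100, 10)],
    [(1, 3, 3), (1, 10, 10)],
    [(16, 64, 64)],
    [(3, 30, 30)],
]

generated = [generate_contract(tasks, 2, 6) for tasks in components]
for contract in generated:
    print(contract)
```

With m = 2 and class 6 as the lowest class, the generated contracts coincide with the four contracts listed in Section 5.3.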


At run time, upon the arrival of a job (i, j), a demand for resource supply is
added to the server corresponding to task i at the start of the next time unit.

Online execution:

(1) Initialization:
(2)     foreach (k, l)
(3)         w_{k,l} := 0;
(4) Upon the arrival of job (i, j),
(5)     k := ⌊log_m i.d⌋
(6)     l := ⌊log_m i.p⌋
(7)     w_{k,l} := w_{k,l} + i.c
(8) Upon the start of time unit (t, t + 1)
(9)     foreach server (k, l) such that w_{k,l} > 0
(10)        load := load + w_{k,l};
(11)        w_{k,l} := 0;


5.3  Example 

Having  described  how  HCC  works,  we  illustrate  the  HCC  approach  by  an  exam¬ 
ple  below. 

In  this  example,  we  design  a  system  with  four  components  with  the  following 
specifications. 

—  Component $C_0$ consists of one task for emergency action and 2 periodic
routine tasks. The emergency action takes little execution time and rarely
happens, but when a malfunction occurs, the action must be performed
immediately. We abstract this action by a sporadic task
$T_0 = (1, \infty, 1)$, which means that the execution time and relative
deadline are both 1, and the minimum interval between consecutive arrivals is
infinite. The periodic routine tasks are given by $T_1 = (1, 80, 8)$ and
$T_2 = (1, 100, 10)$.

—  Component $C_1$ is a group of periodic routine tasks defined as follows:
$T_3 = (1, 3, 3)$, $T_4 = (1, 10, 10)$.

—  Component $C_2$ is a bandwidth-intensive application, which needs 25 percent
of the resource. It can be modeled as $T_5 = (16, 64, 64)$.

—  Component $C_3$ has one periodic task $T_6 = (3, 30, 30)$.

The values of $m$ and $K$ are arbitrarily selected as 2 and 6 by the system
designer, based on estimations of the potential workloads. Let us apply the
contract generation as defined in this paper. Four contracts will be produced as
follows. Recall that a workload is defined as $(k, l, w)$.

—  $V_0 = \{(0, 6, 1), (3, 6, 2)\}$, where $T_0$ is mapped to workload
$(0, 6, 1)$, and $T_1$ and $T_2$ are mapped to $(3, 6, 2)$.

—  $V_1 = \{(1, 1, 1), (3, 3, 1)\}$, where $T_3$ is mapped to $(1, 1, 1)$, and
$T_4$ is mapped to $(3, 3, 1)$.

—  $V_2 = \{(6, 6, 16)\}$.

—  $V_3 = \{(4, 4, 3)\}$.

Suppose that all components become ready at time 0, and the admission
decisions are made according to their index order. For all $0 \le k \le 6$,
$R'_k$ remains non-negative when $C_0$, $C_1$, $C_2$ are admitted. However,
during the admission of $C_3$, $R'_6 < 0$, therefore $C_3$ is not admitted.
Table 1 shows the change of $R_k$ during the admission procedure, and Table 2
shows the established servers on all classes after that.


Table  1.  Component  Admission 


Class  initial  Component 0             Component 1             Component 2  Component 3
                (0, 6, 1)   (3, 6, 2)   (1, 1, 1)   (3, 3, 1)   (6, 6, 16)   (4, 4, 3)
R_0    1        0           0           0           0           0            0
R_1    2        1           1           0           0           0            0
R_2    4        3           3           1           1           1            1
R_3    8        7           5           1           0           0            0
R_4    16       15          13          5           3           3            0
R_5    32       31          29          13          9           9            3
R_6    64       63          61          29          21          5            -7

Assume  that  the  first  job  of  To  arrives  at  time  4  and  the  online  executions 
of  all  components  are  defined  as  in  this  paper.  We  now  show  a  step  by  step 
execution  from  time  0  to  time  4. 

Table 2. Servers on All Classes

Class  Servers
L_0    {(0, 0, 6)}
L_1    {(1, 1, 1)}
L_2    {}
L_3    {(0, 3, 6), (1, 3, 3)}
L_4    {}
L_5    {}
L_6    {(2, 6, 6)}

At time 0, the budget registers of all servers have been initialized according
to their weights, and the components add their current demands to the
corresponding load registers, as shown in Table 3. The coordinator moves the
in-budget loads into register carry, and the consumed budgets are recorded for
replenishment in the future. The carried value of server (1, 1, 1) becomes 1.
Server (0, 0, 6) is not carried, therefore server (1, 1, 1) is selected to
supply time in the time unit (0, 1). Its carry is then decremented back to 0.
Table 4 shows the register image after the execution of the coordinator.


Table  3.  Register  Image  Right  After  Component  Loading  At  Time  0 


Server     budget  load  carry  replenish
(0, 0, 6)  1       0     0      {}
(1, 1, 1)  1       1     0      {}
(0, 3, 6)  2       2     0      {}
(1, 3, 3)  1       1     0      {}
(2, 6, 6)  16      16    0      {}

Table  4.  Register  Image  Right  After  Coordinator  Execution  At  Time  0 


Server     budget  load  carry  replenish
(0, 0, 6)  1       0     0      {}
(1, 1, 1)  0       0     0      {(1, 2)}
(0, 3, 6)  0       0     2      {(2, 64)}
(1, 3, 3)  0       0     1      {(1, 8)}
(2, 6, 6)  0       0     16     {(16, 64)}

Between  time  (0, 1),  no  load  is  added  from  any  component.  At  time  1,  server 
(0, 3, 6)  is  selected  to  supply  between  (1, 2)  so  its  carry  is  decremented,  as  shown 
in  Table  5. 

At  time  2,  server  (1, 1, 1)  replenishes  its  budget,  and  server  (0,  3,  6)  is  selected 
as  supplier  and  so  its  value  of  carry  is  decremented,  as  shown  in  Table  6. 

Table 5. Register Image Right After Coordinator Execution At Time 1

Server     budget  load  carry  replenish
(0, 0, 6)  1       0     0      {}
(1, 1, 1)  0       0     0      {(1, 2)}
(0, 3, 6)  0       0     1      {(2, 64)}
(1, 3, 3)  0       0     1      {(1, 8)}
(2, 6, 6)  0       0     16     {(16, 64)}

Table 6. Register Image Right After Coordinator Execution At Time 2

Server     budget  load  carry  replenish
(0, 0, 6)  1       0     0      {}
(1, 1, 1)  1       0     0      {}
(0, 3, 6)  0       0     0      {(2, 64)}
(1, 3, 3)  0       0     1      {(1, 8)}
(2, 6, 6)  0       0     16     {(16, 64)}

At time 3, the second job of $T_3$ is ready, so $C_1$ loads server (1, 1, 1)
by 1, as shown in Table 7. On the coordinator side, budget is available for
server (1, 1, 1), therefore budget is consumed for the load and carry is
incremented by 1. Budget is consumed, and therefore a future replenishment is
added to replenish. Then server (1, 1, 1) is selected as supplier, and its carry
is decremented by 1. Table 8 shows the register image after the coordinator
execution.


Table  7.  Register  Image  Right  After  Component  Loading  At  Time  3 


Server     budget  load  carry  replenish
(0, 0, 6)  1       0     0      {}
(1, 1, 1)  1       1     0      {}
(0, 3, 6)  0       0     0      {(2, 64)}
(1, 3, 3)  0       0     1      {(1, 8)}
(2, 6, 6)  0       0     16     {(16, 64)}

Table  8.  Register  Image  Right  After  Coordinator  Execution  At  Time  3 


Server     budget  load  carry  replenish
(0, 0, 6)  1       0     0      {}
(1, 1, 1)  0       0     0      {(1, 5)}
(0, 3, 6)  0       0     0      {(2, 64)}
(1, 3, 3)  0       0     1      {(1, 8)}
(2, 6, 6)  0       0     16     {(16, 64)}

At time 4, a job of task $T_0$ arrives. Therefore server (0, 0, 6) is loaded
by 1, as shown in Table 9. During the coordinator execution, budget is available
for (0, 0, 6) and consumed, the future replenishment is stored, and the value of
carry is incremented by 1. Then server (0, 0, 6) is selected to supply, and its
carry is decremented back to 0. Table 10 shows the register image after these
executions.


Table  9.  Register  Image  Right  After  Component  Loading  At  Time  4 


Server     budget  load  carry  replenish
(0, 0, 6)  1       1     0      {}
(1, 1, 1)  0       0     0      {(1, 5)}
(0, 3, 6)  0       0     0      {(2, 64)}
(1, 3, 3)  0       0     1      {(1, 8)}
(2, 6, 6)  0       0     16     {(16, 64)}

Table  10.  Register  Image  Right  After  Coordinator  Execution  At  Time  4 


Server     budget  load  carry  replenish
(0, 0, 6)  0       0     0      {(1, 68)}
(1, 1, 1)  0       0     0      {(1, 5)}
(0, 3, 6)  0       0     0      {(2, 64)}
(1, 3, 3)  0       0     1      {(1, 8)}
(2, 6, 6)  0       0     16     {(16, 64)}

It is noteworthy that a simple fixed-priority composition scheme cannot even
compose $C_0$ and $C_1$ together, for the following reason. Because of the short
deadline of task $T_0$, $C_0$ must have the highest priority. Then there is a
possibility that 3 continuous time units may be supplied to $C_0$, in which case
task $T_3$ in $C_1$ may miss its deadline. The low composability is a result of
not distinguishing the different types of workloads in $C_0$. In contrast, by
Harmonic scheduler composition, $C_0$, $C_1$ and $C_2$ can be admitted one by
one and served at the same time.
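The failure of fixed-priority composition can be reproduced with a minimal sketch. The code assumes one scenario that is not spelled out in the text: all three tasks of the higher-priority component and one job of T3 arrive at time 0. The function name `fixed_priority_timeline` and the `demands` layout are hypothetical.

```python
def fixed_priority_timeline(demands, horizon):
    """Serve components by fixed priority, one time unit per slot.

    demands[c] maps an arrival time to the units requested by component c.
    Components are listed from highest to lowest priority.
    """
    pending = [0] * len(demands)
    timeline = []
    for t in range(horizon):
        for c, arrivals in enumerate(demands):
            pending[c] += arrivals.get(t, 0)
        # The highest-priority component with pending work gets the slot.
        owner = next((c for c, p in enumerate(pending) if p > 0), None)
        if owner is not None:
            pending[owner] -= 1
        timeline.append(owner)
    return timeline


# Component 0 requests 3 units at time 0 (T0, T1 and T2 together, assumed);
# component 1 requests 1 unit at time 0 for a job of T3 with deadline 3.
demands = [{0: 3}, {0: 1}]
timeline = fixed_priority_timeline(demands, 4)
t3_finish = timeline.index(1) + 1

print(timeline, t3_finish)
```

In this scenario the job of T3 finishes at time 4, after its deadline of 3, whereas under HCC its server belongs to class 1 and is guaranteed supply within 2 time units.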


5.4  Analysis 

If a component $C_h$ is admitted by the coordinator, then the coordinator will
supply resources to $C_h$ according to the supply contract $V_h$. Assuming that
there is a workload $(k, l, w)$ in $V_h$, a server $(h, k, l)$ is established.
Within any time interval of length $m^l$, up to $w$ time units of supply may be
loaded to the server, and every demand will obtain supply within $m^k$ units of
time after the demand is loaded. We call this the responsiveness guarantee.
However, if the accumulated load exceeds $w$ time units within a time interval
of length $m^l$, the server is overloaded and the responsiveness guarantee will
no longer be provided. The rationale here is that if the component breaks the
supply contract by overloading, the coordinator cannot guarantee prompt supply.
On the other hand, a non-overloaded server always provides the responsiveness
guarantee, even when other servers (including other servers of the same
component) are overloaded. We shall prove the responsiveness guarantee.

First,  we  prove  that  in  a  non-overloaded  server,  load  never  waits  for  budget. 

Lemma 1 For a non-overloaded server $(h, k, l)$, load $\le$ budget at any
non-negative integer time $t$ after budget replenishment.

Proof: Base case: At time 0, register budget is initialized to $w$, and a
non-overloading component loads less than or equal to $w$ at time 0. The lemma
is true.

Induction case: Assume that for any non-negative integer $t \le n$, the lemma
is true. We now prove that the lemma is still true at time $n + 1$ by
contradiction.

Assume the contrary: The value of load and the value of budget at time $n + 1$
after replenishment are $x$ and $y$, and $x > y$.

Let $n' = \max(0, n + 1 - m^l)$. Assume that the budget consumed after time
$n'$ but before or at time $n$ is $z$; then $y + z = w$.

Because the lemma is true at time $n'$, all loads that arrived before or at
time $n'$ are carried before or at time $n'$, so the budget consumed between
$(n', n)$ is for load that arrived after $n'$ and before or at time $n$. Because
the lemma is true for time $n$, load is decreased to 0 after the execution of
the coordinator at time $n$. Therefore, the aggregate load after time $n'$ and
before or at time $n$ is equal to the budget consumption during the same
interval of time, which is $z$.

Also, the aggregate arrival of load after time $n$ but before or at time
$n + 1$ is $x$. The aggregate arrival of load after time $n'$ and before or at
time $n + 1$ is $x + z$. Thus $x + z > y + z = w$, and the server is overloaded,
a contradiction. ■

A non-negative integer time $t$ is class $k$ un-carried if all servers of class
$k$ or higher have zero value for carry before the coordinator execution at time
$t$. At a class $k$ un-carried time $t$, all previously loaded in-budget work
for servers of class $k$ or higher is completely supplied.

Lemma 2 If $t$ is a class $k$ un-carried time, then there exists another class
$k$ un-carried time $t'$ such that $t' \le t + m^k$.

Proof: According to the admission control algorithm, the aggregation of
existing budget from all servers of class $k$ or higher at time $t$ before the
coordinator execution, plus the replenishment at or after time $t$ and before
$t + m^k$, will not exceed $P_k = m^k$. Therefore, the maximal aggregate value
that can be added to carry of all servers of class $k$ or higher will not exceed
$m^k$. At any integer time $t$, if there exists a server of class $k$ or higher
with carry > 0, a supply is drawn from a server of class $k$ or higher and its
carry is decreased. If $t'$ does not exist after time $t$ and before time
$t + m^k$, then carry is decreased by $m^k$ at or after time $t$ and before
$t + m^k$, and time $t + m^k$ must be a class $k$ un-carried time. Therefore the
lemma holds. ■
Theorem 2 If server $(h, k, l)$ is not overloaded at any time, it provides the
responsiveness guarantee.

Proof: Time 0 is a class $k$ un-carried time. According to Lemma 2, at any
time $t$, there exists another un-carried time $t'$ for class $k$ before or at
time $t + m^k$. According to Lemma 1, if component $C_h$ adds load at time $t$,
the complete load is moved to carry at time $t$. Because carry = 0 at time $t'$,
the supply corresponding to the demand loaded at time $t$ is made before time
$t'$. Therefore the responsiveness guarantee is maintained. ■

The computational complexity of admission for a component $C_h$ is bounded
by $O(K \cdot |V_h|)$, where $K$ is the maximal number of classes, and $|V_h|$
is the number of workloads in the contract, which is bounded by $K^2$. The
online coordinator overhead for each time unit is bounded by $O(n \cdot s)$,
where $n$ is the number of components and $s$ is the maximal number of servers
for a component, which is bounded by $K^2$. Because the period of classes
increases exponentially, $K$ should be a small number.

6  Comparison  with  Related  Work 

There has been a significant amount of work on composition in the last few
years, as pointed out in Section 1 of this paper. Unlike [2] and [4], which use
EDF online for scheduling resource supply among components, our HCC approach
uses a rate monotonic classification of workloads; the coordinator applies a
fixed priority policy among workload classes. The urgency of workloads from
components is expressed by their classes instead of explicit deadlines. The rate
monotonic design of HCC makes admission control and budget management simple,
yet maintains good composability.

Many  hard  and/or  soft  real-time  scheduling  approaches  depend  on  a  server 
budget  to  control  the  resource  supply  to  a  component  to  maintain  a  fair  share. 
Total  Bandwidth  Server  [6]  is  one  example  of  this  approach.  Like  servers,  HCC 
also  makes  use  of  the  budget  idea.  Because  HCC  is  not  deadline-based  and 
temporal  workload  control  depends  totally  on  budget  control,  HCC  does  not 
require  as  much  communication  (e.g.,  deadlines  of  newly  arrived  jobs)  between 
the  system-level  scheduler  and  the  component  schedulers  and  is  hence  a  less 
costly  and  easier  to  implement  budget-enforcement  strategy. 

Cayssials et al. proposed an approach to minimize the number of priorities in
a rate-monotonic fixed priority scheme [1]. In their approach, multiple tasks
are grouped into a class and scheduled on the same priority level. Although HCC
solves a different problem and its design is significantly different from that
of Cayssials et al., both algorithms exploit the idea of classification of
workloads.

7  Future  Work 

Whereas the Harmonic Component Composition is a dynamic approach in which
the coordinator does not depend on internal knowledge of components, we are
also investigating another approach to composition that improves composability
and online resource supply efficiency by exploiting a priori knowledge of the
components. Unlike the approach described in this paper, this alternative approach
requires extensive offline computation. We believe that these two composition
approaches span the two far ends of a wide spectrum of practical solutions for
composing real-time schedulers. There is still much to be explored in the
spectrum of solutions by a combination of the approaches. This is a subject for
further investigation.

Abstract. We investigate the combination of propositional SAT checkers with
domain-specific theorem provers as a foundation for bounded model checking
over infinite domains. Given a program M over an infinite state type, a linear
temporal logic formula φ with domain-specific constraints over program states,
and an upper bound k, our procedure determines if there is a falsifying path of
length k to the hypothesis that M satisfies the specification φ. This problem can
be reduced to the satisfiability of Boolean constraint formulas. Our verification
engine for these kinds of formulas is lazy in that propositional abstractions of
Boolean constraint formulas are incrementally refined by generating lemmas on
demand from an automated analysis of spurious counterexamples using theorem
proving. We exemplify bounded model checking for timed automata and for RTL
level descriptions, and investigate the lazy integration of SAT solving and
theorem proving.


1  Introduction 

Model checking decides the problem of whether a system satisfies a temporal logic
property by exploring the underlying state space. It applies primarily to finite-state
systems but also to certain infinite-state systems, and the state space can be represented
in symbolic or explicit form. Symbolic model checking has traditionally employed a
Boolean representation of state sets using binary decision diagrams (BDD) [4] as a way
of checking temporal properties, whereas explicit-state model checkers enumerate the
set of reachable states of the system.

* This research was supported by SRI International internal research and development, the
DARPA NEST program through Contract F33615-01-C-1908 with AFRL, and the National
Science Foundation under grants CCR-00-86096 and CCR-0082560.

** Also affiliated with University of Ulm, Germany.

Recently, the use of Boolean satisfiability (SAT) solvers for linear-time temporal
logic (LTL) properties has been explored through a technique known as bounded model
checking (BMC) [7]. As with symbolic model checking, the state is encoded in terms
of Booleans. The program is unrolled a bounded number of steps for some bound k,
and an LTL property is checked for counterexamples over computations of length k.
For example, to check whether a program with initial state I and next-state relation T
violates the invariant Inv in the first k steps, one checks, using a SAT solver:

$$I(s_0) \land T(s_0, s_1) \land T(s_1, s_2) \land \ldots \land T(s_{k-1}, s_k) \land (\lnot \mathit{Inv}(s_0) \lor \ldots \lor \lnot \mathit{Inv}(s_k)).$$

This formula is satisfiable if and only if there exists a path of length at most k from the
initial state s_0 that violates the invariant Inv. For finite-state systems, BMC can be
seen as a complete procedure since the size of counterexamples is essentially bounded
by the diameter of the system [3]. It has been demonstrated that BMC can be more
effective in falsifying hypotheses than traditional model checking [7, 8].
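The unrolled formula above can be illustrated on a tiny finite-state system. The following is a minimal sketch, not the procedure used in the paper: it replaces the SAT solver with plain enumeration of candidate paths, and the modulo-4 counter, the names `bmc_invariant`, `init`, `trans`, and `inv` are hypothetical choices made for this example only.

```python
from itertools import product

def bmc_invariant(states, init, trans, inv, k):
    """Return a path s_0..s_k with I(s_0), T(s_i, s_{i+1}) for all i < k,
    and at least one state violating inv; return None if no such path exists."""
    for path in product(states, repeat=k + 1):
        if not init(path[0]):
            continue
        if not all(trans(path[i], path[i + 1]) for i in range(k)):
            continue
        if any(not inv(s) for s in path):
            return list(path)
    return None

# Hypothetical toy system: a counter modulo 4 that starts at 0.
states = range(4)
init = lambda s: s == 0
trans = lambda s, t: t == (s + 1) % 4
inv = lambda s: s != 3  # claimed invariant: the counter never reaches 3

no_violation_within_2 = bmc_invariant(states, init, trans, inv, 2)
violation_within_3 = bmc_invariant(states, init, trans, inv, 3)
```

With bound 2 the search finds nothing, while bound 3 exposes the violating path; this mirrors how the bound k limits which counterexamples BMC can find.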

It is possible to extend the range of BMC to infinite-state systems by encoding
the search for a counterexample as a satisfiability problem for the logic of Boolean
constraint formulas. For example, the BMC problem for timed automata can be
captured in terms of a Boolean formula with linear arithmetic constraints. But the method
presented here scales well beyond such simple arithmetic clauses, since the main
requirement on any given constraint theory is the decidability of the satisfiability problem
on conjunctions of atomic constraints. Possible constraint theories include, for
example, linear arithmetic, bitvectors, arrays, regular expressions, equalities over terms with
uninterpreted function symbols, and combinations thereof [20, 24].

Whereas BMC over finite-state systems deals with finding satisfying Boolean
assignments, its generalization to infinite-state systems is concerned with satisfiability of
Boolean constraint formulas. In initial experiments with PVS [21] strategies, based on
a combination of BDDs for propositional reasoning and a variant of loop residue [27]
for arithmetic, we were usually only able to construct counterexamples of small depths
(< 5). Clearly, more specialized verification techniques are needed. Since BMC
problems are often propositionally intensive, it seems to be more effective to augment SAT
solvers with theorem proving capabilities, such as ICS [10], than to add propositional
search capabilities to theorem provers.

Here, we look at the specific combination of SAT solvers with decision procedures,
and we propose a method that we call lemmas on demand, which invokes the theorem
prover lazily in order to efficiently prune out spurious counterexamples, namely,
counterexamples that are generated by the SAT solver but discarded by the theorem prover
by interpreting the propositional atoms. For example, the SAT solver might yield the
satisfying assignment p, ¬q, where the propositional variable p represents the atom
x = y, and q represents f(x) = f(y). A decision procedure can easily detect the
inconsistency in this assignment. More importantly, it can be used to generate a set of
conflicting assignments that can be used to construct a lemma that further constrains
the search. In the above example, the lemma ¬p ∨ q can be added as a new clause in the
input to the SAT solver. This process of refining Boolean formulas is similar in spirit
to the refinement of abstractions based on the analysis of spurious counterexamples or
failed proof attempts [26, 25, 6, 16, 9, 14, 17].
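The example with p (for x = y) and q (for f(x) = f(y)) can be sanity-checked by brute force. The sketch below is only an illustration under a strong assumption: it enumerates interpretations over a two-element domain (`DOMAIN`, a hypothetical choice) instead of running a real decision procedure for uninterpreted functions. It collects the truth-value pairs (p, q) that some interpretation actually realizes.

```python
from itertools import product

DOMAIN = (0, 1)  # assumed tiny domain, for illustration only

def models():
    """Enumerate all (x, y, f) with x, y in DOMAIN and f: DOMAIN -> DOMAIN."""
    for x, y in product(DOMAIN, repeat=2):
        for images in product(DOMAIN, repeat=len(DOMAIN)):
            f = dict(zip(DOMAIN, images))
            yield x, y, f

def atom_values(x, y, f):
    p = (x == y)        # p abstracts the atom x = y
    q = (f[x] == f[y])  # q abstracts the atom f(x) = f(y)
    return p, q

realizable = {atom_values(*m) for m in models()}
# The SAT-level assignment p, not q is never realized: it is spurious.
spurious = (True, False) not in realizable
# The lemma (not p) or q holds in every enumerated interpretation.
lemma_holds = all((not p) or q for p, q in realizable)
```

Adding the clause ¬p ∨ q to the propositional problem removes exactly the spurious pair and nothing else.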

From a set of inconsistent constraints in a spurious counterexample we obtain an
explanation as an overapproximation of the minimal, inconsistent subset of these
constraints. The smaller the explanation that is generated from a spurious counterexample,
the greater the pruning in the subsequent search. In this way, the computation of
explanations accelerates the convergence of our procedure.

Altogether, we present a method for bounded model checking over infinite-state
systems that consists of:

- A reduction to the satisfiability problem for Boolean constraint formulas.

- A lazy combination of SAT solving and theorem proving.

- An efficient method for constructing small explanations.

In general, BMC over infinite-state systems is not complete, but we obtain a
completeness result for BMC problems with invariant properties. The main condition on
constraints is that the satisfiability of the conjunction of constraints is decidable. Thus, our
BMC procedure can be applied to infinite-state systems even when the (more) general
model-checking problem is undecidable.

The paper is structured as follows. In Section 2 we provide some background
material on Boolean constraints. Section 3 lays the foundation of a refinement-based
satisfiability procedure for Boolean constraint logic. Next, Section 4 presents the details of
BMC over domain-specific constraints, and Section 5 discusses some simple examples
for BMC over clock constraints and the theory of bitvectors. In Section 6 we
experimentally investigate various design choices in lazy integrations of SAT solvers with
theorem proving. Finally, in Sections 7 and 8 we compare with related work and we
draw conclusions.

2  Background 

A set of variables V := {x_1, ..., x_n} is said to be typed if there are nonempty sets D_1
through D_n and a type assignment τ such that τ(x_i) = D_i. For a set of typed variables
V, a variable assignment is a function ν from variables x ∈ V to an element of τ(x).

Let V be a set of typed variables and L be an associated logical language. A set of
constraints in L is called a constraint theory 𝒞 if it includes constants true, false and if
it is closed under negation; a subset of 𝒞 of constraints with free variables in V' ⊆ V is
denoted by 𝒞(V'). For c ∈ 𝒞 and ν an assignment for the free variables in c, the value
of the predicate ⟦c⟧_ν is called the interpretation of c w.r.t. ν. Hereby, ⟦true⟧_ν (⟦false⟧_ν)
is assumed to hold for all (for no) ν, and ⟦¬c⟧_ν holds iff ⟦c⟧_ν does not hold. A set of
constraints C ⊆ 𝒞 is said to be satisfiable if there exists a variable assignment ν such
that ⟦c⟧_ν holds for every c in C; otherwise, C is said to be unsatisfiable. Furthermore, a
function 𝒞-sat(C) is called a 𝒞-satisfiability solver if it returns ⊥ if the set of constraints
C is unsatisfiable and a satisfying assignment for C otherwise.

For a given theory 𝒞, the set of Boolean constraints Bool(𝒞) includes all constraints
in 𝒞 and it is closed under conjunction ∧, disjunction ∨, and negation ¬. The
notions of satisfiability, inconsistency, satisfying assignment, and satisfiability solver are
homomorphically lifted to the set of Boolean constraints in the usual way. If V =
{p_1, ..., p_n} and the corresponding type assignment τ(p_i) is either true or false, then
Bool({true, false} ∪ V) reduces to the usual notion of Boolean logic with
propositional variables {p_1, ..., p_n}. We also call a Boolean satisfiability solver a SAT solver.
N-ary disjunctions of constraints are also referred to as clauses, and a formula φ ∈
Bool(𝒞(V)) is in conjunctive normal form (CNF) if it is an n-ary conjunction of clauses.
There is a linear-time satisfiability-preserving transformation into CNF [22].


3  Lazy  Theorem  Proving 

Satisfiability solvers for propositional constraint formulas can be obtained from the
combination of a propositional SAT solver with decision procedures simply by
converting the problem into disjunctive normal form, but the result is prohibitively expensive.
Here, we lay out the foundation of a lazy combination of SAT solvers with constraint
solvers based on an incremental refinement of Boolean formulas. We restrict our
analysis to formulas in CNF, since most modern SAT solvers expect their input to be in this
format.

Translation schemes between propositional formulas and Boolean constraint
formulas are needed. Given a formula φ, such a correspondence is easily obtained by
abstracting constraints in φ with (fresh) propositional variables. More formally, for
a formula φ ∈ Bool(𝒞) with atoms C = {c_1, ..., c_n} ⊆ 𝒞 and a set of
propositional variables P = {p_1, ..., p_n} not occurring in φ, the mapping α from Boolean
formulas over {c_1, ..., c_n} to Boolean formulas over P is defined as the
homomorphism induced by α(c_i) = p_i. The inverse γ of such an abstraction mapping α simply
replaces propositional variables p_i with their associated constraints c_i. For example,
the formula φ = f(x) ≠ x ∧ f(f(x)) = x over equalities of terms with
uninterpreted function symbols determines the function α with, say, α(f(x) ≠ x) = p_1
and α(f(f(x)) = x) = p_2; thus α(φ) = p_1 ∧ p_2. Moreover, a Boolean assignment
ν : P → {true, false} induces a set of constraints

$$\gamma(\nu) = \{\, c \in \mathcal{C} \mid \exists i.\ \text{if } \nu(p_i) = \mathit{true} \text{ then } c = \gamma(p_i) \text{ else } c = \lnot\gamma(p_i) \,\}.$$

Now, given a Boolean variable assignment ν such that ν(p_1) = false and ν(p_2) = true,
γ(ν) is the set of constraints {f(x) = x, f(f(x)) = x}. A consistent set of constraints
C determines a set of assignments. For choosing an arbitrary but fixed assignment from
this set, we assume as given a function choose(C).
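The abstraction α and its inverse γ from the example above can be written down directly. This is a minimal sketch that keeps constraints as plain strings; the names `ATOMS`, `NEGATION`, `alpha`, `gamma_var`, and `gamma` are hypothetical, and negation is given by an explicit lookup table rather than computed.

```python
# Atoms are kept as plain strings; NEGATION gives the complementary constraint.
ATOMS = ["f(x) != x", "f(f(x)) = x"]
NEGATION = {"f(x) != x": "f(x) = x", "f(f(x)) = x": "f(f(x)) != x"}

# alpha maps each atom c_i to a fresh propositional variable p_i.
alpha = {atom: f"p{i}" for i, atom in enumerate(ATOMS, start=1)}
# gamma_var is the inverse mapping on propositional variables.
gamma_var = {p: atom for atom, p in alpha.items()}

def gamma(assignment):
    """Set of constraints induced by a Boolean assignment {p_i: bool}."""
    return {
        gamma_var[p] if value else NEGATION[gamma_var[p]]
        for p, value in assignment.items()
    }

induced = gamma({"p1": False, "p2": True})
```

For the assignment with p_1 false and p_2 true, `induced` is the constraint set {f(x) = x, f(f(x)) = x} given in the text.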

Theorem 1. Let φ ∈ Bool(𝒞) be a formula in CNF, ℒ be the literals in α(φ), and
I(φ) := {L ⊆ ℒ | γ(L) is 𝒞-inconsistent} be the set of 𝒞-inconsistencies for φ; then: φ
is 𝒞-satisfiable iff the following Boolean formula is satisfiable:

$$\alpha(\varphi) \land \Big( \bigwedge_{\{l_1, \ldots, l_n\} \in I(\varphi)} (\lnot l_1 \lor \ldots \lor \lnot l_n) \Big).$$

Thus, every Bool(𝒞) formula can be transformed into an equisatisfiable Boolean
formula as long as the consistency problem for sets of constraints in 𝒞 is decidable.
This transformation enables one to use off-the-shelf satisfiability checkers to determine
the satisfiability of Boolean constraint formulas. On the other hand, the set of
literals is exponential in the number of variables and, therefore, an exponential number of
𝒞-inconsistency checks is required in the worst case. It has been observed, however, that
in many cases only small fragments of the set of 𝒞-inconsistencies are needed.

sat(φ)
    p := α(φ);
    loop
        ν := B-sat(p);
        if ν = ⊥ then return ⊥;
        if 𝒞-sat(γ(ν)) ≠ ⊥ then return choose(γ(ν));
        I := ⋁_{c ∈ γ(ν)} ¬α(c);  p := p ∧ I
    endloop

Fig. 1. Lazy theorem proving for Bool(𝒞).


Starting with p = α(φ), the procedure sat(φ) in Figure 1 realizes a guided
enumeration of the set of 𝒞-inconsistencies. In each loop, the SAT solver B-sat suggests
a candidate assignment ν for the Boolean formula p, and the satisfiability solver 𝒞-sat
for 𝒞 checks whether the corresponding set of constraints γ(ν) is consistent. Whenever
this consistency check fails, p is refined by adding a Boolean analogue I of this
inconsistency, and B-sat is applied to suggest a new candidate assignment for the refined
formula p ∧ I. This procedure terminates, since, in every loop, I is not subsumed by p,
and there are only a finite number of such strengthenings.

Corollary 1. sat(φ) in Figure 1 is a satisfiability solver for Bool(𝒞) formulas in CNF.

We list some essential optimizations. If the variable assignments returned by the SAT
solver are partial in that they include don't care values, then the number of argument
constraints to 𝒞-sat can usually be reduced considerably. The use of don't care values
also speeds up convergence, since more general lemmas are generated. Now, assume a
function explain(C), which, for an inconsistent set of constraints C, returns a minimal
number of inconsistent constraints in C or a "good" overapproximation thereof. The
use of explain(C) instead of the stronger C obviously accelerates the procedure. We
experimentally analyze these efficiency issues in Section 6.
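The loop of Figure 1 can be sketched in a few lines. This is an illustrative toy under several assumptions, not the implementation evaluated in the paper: `b_sat` is a brute-force stand-in for a SAT solver, `c_sat` is a union-find check for a theory of equalities and disequalities between variables only, the lemma is the negation of the whole spurious assignment (no explain function, no don't care values), and the function returns the Boolean assignment rather than choose(γ(ν)). All names are hypothetical.

```python
from itertools import product

def b_sat(clauses, variables):
    """Brute-force SAT: clauses are lists of (variable, polarity) literals."""
    for values in product([True, False], repeat=len(variables)):
        nu = dict(zip(variables, values))
        if all(any(nu[v] == pol for v, pol in clause) for clause in clauses):
            return nu
    return None  # plays the role of "bottom"

def c_sat(constraints):
    """Toy theory solver: constraints are (a, b, is_equal) over variable names.
    Returns True iff the equalities and disequalities are jointly consistent."""
    parent = {}

    def find(a):
        parent.setdefault(a, a)
        while parent[a] != a:
            a = parent[a]
        return a

    for a, b, is_equal in constraints:
        if is_equal:
            parent[find(a)] = find(b)
    return all(find(a) != find(b) for a, b, is_equal in constraints if not is_equal)

def lazy_sat(clauses, atoms):
    """atoms maps each propositional variable to the equality (a, b) it abstracts.
    Returns (assignment or None, number of lemmas added)."""
    clauses = list(clauses)
    variables = sorted(atoms)
    lemmas = 0
    while True:
        nu = b_sat(clauses, variables)
        if nu is None:
            return None, lemmas
        constraints = [(*atoms[p], nu[p]) for p in variables]
        if c_sat(constraints):
            return nu, lemmas
        # Lemma: disjunction of the negated literals of the spurious assignment.
        clauses.append([(p, not nu[p]) for p in variables])
        lemmas += 1

ATOMS = {"p1": ("x", "y"), "p2": ("y", "z"), "p3": ("x", "z")}
# x = y and y = z and not (x = z): propositionally satisfiable, theory-unsatisfiable.
UNSAT = [[("p1", True)], [("p2", True)], [("p3", False)]]
# x = y and (not (y = z) or not (x = z)): satisfiable after two lemmas.
SAT = [[("p1", True)], [("p2", False), ("p3", False)]]
```

On `UNSAT` a single lemma suffices to make the propositional problem unsatisfiable; on `SAT` two spurious candidates are pruned before a theory-consistent assignment is found. A smaller explanation would yield a shorter lemma and prune more candidates per iteration.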


4  Infinite-State  BMC 

Given a BMC problem for an infinite-state program, an LTL formula with constraints,
and a bound on the length of counterexamples to be searched for, we describe a sound
reduction to the satisfiability problem of Boolean constraint formulas and we show
completeness for invariant properties. The encoding of transition relations follows the
now-standard approach already taken in [13]. Whereas in [7] LTL formulas are
translated directly into propositional formulas, we use Büchi automata for this encoding.
This simplifies substantially the notations and the proofs, but a direct translation can
sometimes be more succinct in the number of variables needed. We use the common
notions for finite automata over finite and infinite words, and we assume as given a
constraint theory 𝒞 with a satisfiability solver.

[Figure 2: automaton diagram with edge labels x' = x + m; true; and x ≥ 0, x' = x − m − 1.]

Fig. 2. The simple example.

Typed variables in V := {x_1, ..., x_n} are also called state variables, and a program
state is a variable assignment over V. A pair ⟨I, T⟩ is a 𝒞-program over V if I ∈
Bool(𝒞(V)) and T ∈ Bool(𝒞(V ∪ V')), where V' is a primed, disjoint copy of V. I is
used to restrict the set of initial program states, and T specifies the transition relation
between states and their successor states. The set of 𝒞-programs over V is denoted by
Prg(𝒞(V)). The semantics of a program P is given in terms of a transition system M
in the usual way, and, by a slight abuse of notation, we sometimes write M for both
the program and its associated transition system. The system depicted in Figure 2, for
example, is expressed in terms of the program ⟨I, T⟩ over {x, l}, where the counter x is
interpreted over the integers and the variable l for encoding locations is interpreted over
the Booleans (the n-ary connective ⊗ holds iff exactly one of its arguments holds).

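$$\begin{aligned}
I(x, l) &:= x \ge 0 \land l \\
T(x, l, x', l') &:= (l \land x' = x + m \land \lnot l') \otimes {} \\
&\quad\; (\lnot l \land x \ge 0 \land x' = x - m - 1 \land \lnot l') \otimes (\lnot l \land x' = x \land l')
\end{aligned}$$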

Initially, the program is in location l and x is greater than or equal to 0, and the
transitions in Figure 2 are encoded by a conjunction of constraints over the current state
variables x, l and the next state variables x', l'.

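The formulas of the constraint linear temporal logic LTL(𝒞) (in negation normal
form) are linear-time temporal logic formulas with the usual "next", "until", and
"release" operators, and constraints c ∈ 𝒞 as atoms.

$$\varphi ::= \mathit{true} \mid \mathit{false} \mid c \mid \varphi_1 \land \varphi_2 \mid \varphi_1 \lor \varphi_2 \mid \mathbf{X}\,\varphi \mid \varphi_1\,\mathbf{U}\,\varphi_2 \mid \varphi_1\,\mathbf{R}\,\varphi_2$$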

The formula X φ holds on some path π iff φ holds in the second state of π. φ_1 U φ_2
holds on π if there is a state on the path where φ_2 holds, and at every preceding state
on the path φ_1 holds. The release operator R is the logical dual of U. It requires that φ_2
holds along the path up to and including the first state where φ_1 holds. However, φ_1
is not required to hold eventually. The derived operators F φ = true U φ and G φ =
false R φ denote "eventually φ" and "globally φ". Given a program M ∈ Prg(𝒞) and
a path π in M, the satisfiability relation M, π ⊨ φ for an LTL(𝒞) formula φ is given
in the usual way with the notable exception of the case of constraint formulas c. In this
case, M, π ⊨ c if and only if c holds in the start state of π. Assuming the notation
above, the 𝒞-model checking problem M ⊨ φ holds iff for all paths π = s_0, s_1, ... in
M with s_0 ∈ I it is the case that M, π ⊨ φ. Given a bound k, a program M ∈ Prg(𝒞),
and a formula φ ∈ LTL(𝒞), we now consider the problem of constructing a formula
⟦M, φ⟧_k ∈ Bool(𝒞), which is satisfiable if and only if there is a counterexample of
length k for the 𝒞-model checking problem M ⊨ φ. This construction proceeds as
follows.

1. Definition of ⟦M⟧_k as the unfolding of the program M up to step k from initial
states (this requires k disjoint copies of V).

2. Translation of ¬φ into a corresponding Büchi automaton B whose language of
accepting words consists of the satisfying paths of ¬φ.

3. Encoding of the transition system for B and the Büchi acceptance condition as a
Boolean formula, say ⟦B⟧_k.

4. Forming the conjunction ⟦M, φ⟧_k := ⟦B⟧_k ∧ ⟦M⟧_k.

5. A satisfying assignment for the formula ⟦M, φ⟧_k induces a counterexample of
length k for the model checking problem M ⊨ φ.

Definition 1 (Encoding of 𝒞-Programs). The encoding ⟦M⟧_k of the kth unfolding of
a 𝒞-program M = ⟨I, T⟩ in Prg(𝒞({x_1, ..., x_n})) is given by the Bool(𝒞) formula

$$\begin{aligned}
I_0(x[0]) &:= I\langle \{x_i \mapsto x_i[0] \mid x_i \in V\} \rangle \\
T_j(x[j], x[j+1]) &:= T\langle \{x_i \mapsto x_i[j] \mid x_i \in V\} \cup \{x_i' \mapsto x_i[j+1] \mid x_i \in V\} \rangle \\
[\![M]\!]_k &:= I_0(x[0]) \land \bigwedge_{j=0}^{k-1} T_j(x[j], x[j+1])
\end{aligned}$$

where {x_i[j] | 0 ≤ j ≤ k} is a family of typed variables for encoding the state of
variable x_i in the jth step, x[j] is used as an abbreviation for x_1[j], ..., x_n[j], and
T⟨x_i ↦ x_i[j]⟩ denotes simultaneous substitution of x_i by x_i[j] in formula T.

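A two-step unfolding of the simple program in Figure 2 is encoded by ⟦simple⟧_2 :=
I_0 ∧ T_0 ∧ T_1 (*).

$$\begin{aligned}
I_0 &:= x[0] \ge 0 \land l[0] \\
T_0 &:= (l[0] \land (x[1] = x[0] + m) \land \lnot l[1]) \otimes {} \\
&\quad\; (\lnot l[0] \land (x[0] \ge 0) \land (x[1] = x[0] - m - 1) \land \lnot l[1]) \otimes {} \\
&\quad\; (\lnot l[0] \land (x[1] = x[0]) \land l[1]) \\
T_1 &:= (l[1] \land (x[2] = x[1] + m) \land \lnot l[2]) \otimes {} \\
&\quad\; (\lnot l[1] \land (x[1] \ge 0) \land (x[2] = x[1] - m - 1) \land \lnot l[2]) \otimes {} \\
&\quad\; (\lnot l[1] \land (x[2] = x[1]) \land l[2])
\end{aligned}$$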

The translation of linear temporal logic formulas into a corresponding Büchi
automaton is well-studied in the literature [11] and does not require additional
explanation. Notice, however, that the translation of LTL(𝒞) formulas yields Büchi automata
with 𝒞-constraints as labels. Both the resulting transition system and the bounded
acceptance test based on the detection of reachable cycles with at least one final state can
easily be encoded as Bool(𝒞) formulas.

[Figure 3: automaton diagram; the only recoverable edge label is x ≥ 0.]

Fig. 3. Automaton for F (x < 0).


Definition 2 (Encoding of Büchi Automata). Let V = {x_1, ..., x_n} be a set of typed
variables, B = ⟨Σ, Q, Δ, Q^0, F⟩ be a Büchi automaton with labels Σ in Bool(𝒞), and
pc be a variable (not in V), which is interpreted over the finite set of locations Q of
the Büchi automaton. For a given integer k, we obtain, as in Definition 1, families of
variables x_i[j], pc[j] (1 ≤ i ≤ n, 0 ≤ j ≤ k) for representing the jth state of B in
a run of length k. Furthermore, the transition relation of B is encoded in terms of the
𝒞-program B_M over the set of variables {pc} ∪ V, and ⟦B_M⟧_k denotes the encoding of
this program as in Definition 1. Now, given an encoding of the acceptance condition

$$\mathit{acc}(B)_k := \bigvee_{j=0}^{k-1} \Big( pc[k] = pc[j] \land \bigwedge_{v=1}^{n} x_v[k] = x_v[j] \land \Big( \bigvee_{l=j+1}^{k} \bigvee_{f \in F} pc[l] = f \Big) \Big)$$

the k-th unfolding of B is defined by ⟦B⟧_k := ⟦B_M⟧_k ∧ acc(B)_k.

An LTL(𝒞) formula is said to be R-free (U-free) iff there is an equivalent formula
(in negation normal form) not containing the operator R (U). Note that U-free formulas
correspond to the notion of syntactic safety formulas [28, 15]. Now, it can be directly
observed from the semantics of LTL(𝒞) formulas that every R-free formula can be
translated into an automaton over finite words that accepts a prefix of all infinite paths
satisfying the given formula.

Definition 3. Given an automaton B over finite words and the notation as in
Definition 2, the encoding of the k-ary unfolding of B is given by ⟦B_M⟧_k ∧ acc(B)_k with the
acceptance condition

$$\mathit{acc}(B)_k := \bigvee_{j=0}^{k} \bigvee_{f \in F} pc[j] = f.$$

Consider the problem of finding a counterexample of length k = 2 to the hypothesis that
our running example in Figure 2 satisfies G (x ≥ 0). The negated property F (x < 0)
is an R-free formula, and the corresponding automaton B over finite words is displayed
in Figure 3 (l_1 is an accepting state). This automaton is translated, according to
Definition 3, into the formula

$$[\![B]\!]_2 := I(B) \land T_0(B) \land T_1(B) \land \mathit{acc}(B)_2. \qquad (**)$$

The variables pc[j] and x[j] (j = 0, 1, 2) are used to represent the first three states in a
run.

$$\begin{aligned}
I(B) &:= pc[0] = l_0 \\
T_0(B) &:= (pc[0] = l_0 \land x[0] \ge 0 \land pc[1] = l_0) \otimes (pc[0] = l_0 \land x[0] < 0 \land pc[1] = l_1) \\
T_1(B) &:= (pc[1] = l_0 \land x[1] \ge 0 \land pc[2] = l_0) \otimes (pc[1] = l_0 \land x[1] < 0 \land pc[2] = l_1) \\
\mathit{acc}(B)_2 &:= pc[0] = l_1 \lor pc[1] = l_1 \lor pc[2] = l_1
\end{aligned}$$

The bounded model checking problem ⟦simple⟧_2 ∧ ⟦B⟧_2 for the simple program is
obtained by conjoining the formulas (*) and (**). Altogether, we obtain the
counterexample (0, l) → (m, ¬l) → (−1, l) of length 2 for the property G (x ≥ 0).
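The search for this counterexample can be replayed with a small script. This is a rough sketch under explicit assumptions: the parameter m is fixed to a concrete value (`M = 2`), counter values are restricted to a finite window so that paths can be enumerated (standing in for the arithmetic decision procedure), and the automaton for F (x < 0) is replaced by a direct check that some state on the path has x < 0. The transition relation is transcribed from the formula for T given earlier; all function names are hypothetical.

```python
from itertools import product

M = 2  # assumed concrete value for the parameter m

def init(x, l):
    return x >= 0 and l

def exactly_one(*args):
    """The n-ary connective that holds iff exactly one argument holds."""
    return sum(bool(a) for a in args) == 1

def trans(x, l, x2, l2):
    return exactly_one(
        l and x2 == x + M and not l2,
        (not l) and x >= 0 and x2 == x - M - 1 and not l2,
        (not l) and x2 == x and l2,
    )

def find_counterexample(k, bound=4):
    """Search for a path of length k that reaches a state with x < 0.
    Counter values are restricted to [-bound, bound] (an assumption of the sketch)."""
    states = list(product(range(-bound, bound + 1), [True, False]))
    for path in product(states, repeat=k + 1):
        if not init(*path[0]):
            continue
        if not all(trans(*path[i], *path[i + 1]) for i in range(k)):
            continue
        if any(x < 0 for x, _ in path):
            return list(path)
    return None

path = find_counterexample(2)
counter_values = [x for x, _ in path]
```

With these assumptions, no violation is found for k = 1, and for k = 2 the counter values along the returned path are 0, m, −1, matching the counter values of the counterexample in the text.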

Theorem 2 (Soundness). Let M ∈ Prg(𝒞) and φ ∈ LTL(𝒞). If there exists a natural
number k such that ⟦M, φ⟧_k is satisfiable, then M ⊭ φ.

Proof sketch. If ⟦M, φ⟧_k is satisfiable, then so are ⟦B⟧_k and ⟦M⟧_k. From the
satisfiability of ⟦B⟧_k it follows that there exists a path in the Büchi automaton B that accepts
the negation of the formula φ.

In general, BMC over infinite-state systems is not complete. Consider, for example,
the model checking problem M ⊨ φ for the program M = ⟨I, T⟩ over the variable
V = {x} with I = (x = 0) and T = (x' = x + 1) and the formula φ = F (x < 0). M
can be seen as a one-counter automaton, where initially the value of the counter x is 0,
and in every transition the value of x is incremented by 1. Obviously, it is the case that
M ⊭ φ, but there exists no k ∈ ℕ such that the formula ⟦M, φ⟧_k is satisfiable. Since
¬φ is not an R-free formula, the encoding ⟦B⟧_k of the Büchi automaton must contain,
by Definition 2, a finite accepting cycle, described by pc[k] = pc[0] ∧ x[k] = x[0]
or pc[k] = pc[1] ∧ x[k] = x[1] etc. Such a cycle, however, does not exist, since the
program M contains only one noncycling, infinite path, where the value of x increases
in every step, that is x[i + 1] = x[i] + 1, for all i ≥ 0.
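The absence of a loop-back state in this example is easy to observe on the unique run of the counter. The snippet below is only an illustration for a finite range of bounds, not a proof; it ignores the pc component and checks just the x[k] = x[j] part of the loop condition of Definition 2. The function name `has_loop_back` is hypothetical.

```python
def has_loop_back(k):
    """On the unique run x[0] = 0, x[i+1] = x[i] + 1, is there some j < k
    with x[k] = x[j]?"""
    x = [0]
    for _ in range(k):
        x.append(x[-1] + 1)  # T: x' = x + 1
    return any(x[k] == x[j] for j in range(k))

loop_found = any(has_loop_back(k) for k in range(1, 50))
```

Since no state repeats, the acceptance condition that requires a cycle cannot be satisfied for any of the tested bounds.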

Theorem 3 (Completeness for Finite States). Let M be a 𝒞-program with a finite set
of reachable states, φ be an LTL(𝒞) formula, and k be a given bound; then: M ⊭ φ
implies ∃k ∈ ℕ. ⟦M, φ⟧_k is satisfiable.

Proof sketch. If M ⊭ φ, then there is a path in M that falsifies the formula. Since
the set of reachable states is finite, there is a finite k such that ⟦M, φ⟧_k is satisfiable by
construction.

For a U-free formula φ, the negation ¬φ is R-free and can be encoded in terms of an
automaton over finite words. Therefore, by considering only U-free properties one gets
completeness also for programs with an infinite set of reachable states. A particularly
interesting class of U-free formulas are invariant properties.

Theorem 4 (Completeness for Syntactic Safety Formulas). Let M be a 𝒞-program,
φ ∈ LTL(𝒞) be a U-free property, and k be some given integer bound. Then M ⊭ φ
implies ∃k ∈ ℕ. ⟦M, φ⟧_k is satisfiable.

Proof sketch. If M ⊭ φ and φ is U-free then there is a finite prefix of a path of M
that falsifies φ. Thus, by construction of ⟦M, φ⟧_k, there is a finite k such that ⟦M, φ⟧_k
is satisfiable.

This completeness result can easily be generalized to all safety properties [15] by
observing that the prefixes violated by these properties can also be accepted by an
automaton on finite words.

5  Examples 

We demonstrate BMC over clock constraints and the theory of bitvectors by means of
some simple but, we think, illustrative examples.

The timed automaton [1] in Figure 4 has two real-valued clocks x, y, the
transitions are decorated with clock constraints and clock resets, and the invariant y ≤ 1 in
location l_0 specifies that the system may stay in l_0 only as long as the value of y does
not exceed 1. The transitions can easily be described in terms of a program with linear
arithmetic constraints over states (pc, x, y), where pc is interpreted over the set of
locations {l_0, l_1, l_2} and the clock variables x, y are interpreted over ℝ_0^+. Here we show
only the encoding of the time delay steps.

$$\mathit{delay}(pc, x, y, pc', x', y') := \exists \delta \ge 0.\ \big( (pc = l_0 \Rightarrow y' \le 1) \land (x' = x + \delta) \land (y' = y + \delta) \land (pc' = pc) \big).$$

This relation can easily be transformed into an equivalent quantifier-free formula. Now, assume the goal of falsifying the hypothesis that the timed automaton in Figure 4 satisfies the LTL(C) property $\varphi = \mathbf{G}\, \neg l_2$, that is, the automaton never reaches location $l_2$. Using the BMC procedure over linear arithmetic constraints, one finds the counterexample

$$(l_0, x = 0, y = 0) \to (l_1, x = 0, y = 0) \to (l_2, x = 0, y = 0)$$

of length 2. By using Skolemization of the delay step $\delta$ instead of quantifier elimination, explicit constraints are synthesized for the corresponding delay steps in countertraces.
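One way to obtain the quantifier-free form of the delay relation is to note that the conjunct $x' = x + \delta$ fixes $\delta = x' - x$, so the existential quantifier can be removed by substitution. The following is a sketch of that step, assuming the invariant of $l_0$ is $y \le 1$; it is an illustration and not taken from the original text.

```latex
\begin{align*}
\mathit{delay}(pc, x, y, pc', x', y')
  \;\equiv\;& (pc = l_0 \Rightarrow y' \le 1) \\
  \land\;& (x' - x \ge 0) \\
  \land\;& (y' - y = x' - x) \\
  \land\;& (pc' = pc)
\end{align*}
```

The second conjunct carries the side condition $\delta \ge 0$, and the third states that both clocks advance by the same amount.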

Now,  we  examine  BMC  over  a  theory  B  of  bitvectors  by  encoding  the  shift  register 
example  in  [3]  as  follows. 

$$I_{BS}(x_n) := \mathit{true} \qquad T_{BS}(x_n, y_n) := (y_n = x_n[1 : n-1] * 1_1)$$

The variables $x_n$ and $y_n$ are interpreted over bitvectors of length $n$, $x_n[1 : n-1]$ denotes extraction of bits 1 through $n-1$, $*$ denotes concatenation, and $0_n$ ($1_n$) is the constant bitvector of length $n$ with all bits set to zero (one). In the initial state the content of the register $x_n$ is arbitrary. Given the LTL(B) property $\varphi = \mathbf{F}\,(x_n = 0_n)$ and $k = 2$, the corresponding BMC problem reduces to showing satisfiability of the Bool(B) formula

$$(x_1 = x_0[1 : n-1] * 1_1) \land (x_2 = x_1[1 : n-1] * 1_1) \land (x_0 \ne 0_n \lor x_1 \ne 0_n \lor x_2 \ne 0_n) \land (x_0 = x_2 \lor x_1 = x_2).$$

Fig. 5. Bakery Mutual Exclusion Protocol.

The variables $x_0$, $x_1$, $x_2$ are interpreted over bitvectors of size $n$, since they are used to represent the first three states in a run of the shift register. The satisfiability of this formula is established by choosing all unit literals to be true. Using theory-specific canonization (rewrite) steps for the bitvector theory B [18], we obtain an equation between the variables $x_2$ and $x_0$:

$$x_2 = x_1[1 : n-1] * 1_1 = (x_0[1 : n-1] * 1_1)[1 : n-1] * 1_1 = x_0[2 : n-1] * 1_2$$

This canonization step corresponds to a symbolic simulation of depth 2 of the synchronous circuit. Now, in case the SAT solver decides the equation $x_0 = x_2$ to be true, the bitvector decision procedures are confronted with solving the equality $x_0 = x_0[2 : n-1] * 1_2$. The most general solution for $x_0$ is obtained using the solver in [18] and, by simple backsubstitution, one gets a satisfying assignment for $x_0$, $x_1$, $x_2$, which serves as a counterexample for the assertion that the shift register eventually is zero. The number of case splits is linear in the bound $k$, and, by leaving the word size uninterpreted, our procedure invalidates a family of shift registers without runtime penalties.
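The counterexample can be checked concretely for fixed word sizes. The sketch below simulates the shift register on tuples of bits and tests the two conditions of the BMC formula for a given start state. The bit ordering (index 0 is the bit that is shifted out) and the function names are assumptions made for this illustration; the decision procedure in the text works symbolically and does not enumerate word sizes.

```python
def step(x):
    """One transition: drop the first bit and append a single 1 bit."""
    return x[1:] + (1,)


def is_counterexample(x0, k=2):
    """True if the k-step run from x0 is never zero and ends in a loop."""
    run = [tuple(x0)]
    for _ in range(k):
        run.append(step(run[-1]))
    never_zero = all(any(state) for state in run)
    # Loop condition: the last state equals some earlier state of the run.
    has_loop = any(run[i] == run[k] for i in range(k))
    return never_zero and has_loop


# The all-ones register is a fixed point of step, for every word size tried.
for n in (2, 4, 8, 16):
    assert is_counterexample((1,) * n)
```

The all-ones vector satisfies $x_0 = x_0[2 : n-1] * 1_2$ under this bit ordering, which is why it survives both checks.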


6  Efficiency  Issues 

The purpose of the experiments in this section is to identify useful concepts and techniques for obtaining efficient implementations of the lazy theorem proving approach. For these experiments we implemented several refinements of the basic lazy theorem proving algorithm from Section 3, using SAT solvers such as Chaff [19] and ICS [10] for deciding linear arithmetic constraints. These programs return either $\bot$, in case the input Boolean constraint problem is unsatisfiable, or an assignment for the variables. We describe some of our experiments using the Bakery mutual exclusion protocol (see Figure 5). Usually, the $y_i$ counters are initialized with 0, but here we simultaneously consider a family of Bakery algorithms by relaxing the condition on initial values of the counters to $y_1 \ge 0 \land y_2 \ge 0$. Our experiments represent worst-case scenarios in that the corresponding BMC problems are all unsatisfiable. Thus, unsatisfiability of the BMC formula for a given $k$ corresponds to a verification of the mutual exclusion property for paths of length $\le k$.

Initial experiments with a direct implementation of the refinement algorithm in Figure 1 clearly show that this approach quickly becomes impractical. We identified two main reasons for this inefficiency.

First, for the interleaving semantics of the Bakery processes, usually only a small subset of assignments is needed for establishing satisfiability. This can already be demonstrated using the simple example in Figure 2. Suppose a satisfying assignment $\nu$ (counterexample) corresponds to executing the transition $l \to \neg l$ with $x' = x + m$ in the first step; that is, $[l[0]]_\nu$, $[x[1] = x[0] + m]_\nu$, and $[\neg l[1]]_\nu$ hold. Clearly, the values of the literals $x[0] > 0$, $x[1] = x[0] - m - 1$, and $x[1] = x[0]$ are don't cares, since they are associated with some other transition. Overly eager assignment of truth values to these constraints results in useless search. For example, if $[x[1] = x[0]]_\nu$ holds, then an inconsistency is detected, since $m > 0$ and $x[1] = x[0] + m = x[0]$. Consequently, the assignment $\nu$ is discarded and the search continues. To remedy the situation we analyze the structure of the formula before converting it to CNF, and use this information to assign don't care values to literals corresponding to unfired transitions in each step.

Second, the convergence of the refinement process must be accelerated by finding concise overapproximations explain(C) of the minimal set of inconsistent constraints C corresponding to a given Boolean assignment. There is an obvious trade-off between the conciseness of this approximation and the cost for computing it. We are proposing an algorithm for finding such an overapproximation based on rerunning the decision procedures $O(m \times n)$ times, where $m$ is some given upper bound on the number of iterations (see below) and $n$ is the number of given constraints.

The run in Figure 6 illustrates this procedure. The constraints in Figure 6(a) are asserted to ICS from left to right. Since ICS detects a conflict when asserting $y_6 < 0$, this constraint is in the minimal inconsistent set. Now, an overapproximation of the minimal inconsistent sets is produced by connecting constraints with common variables (Figure 6(a)). This overapproximation is iteratively refined by collecting the constraints in an array as illustrated in Figure 6(b). Configurations consist of triples $(C, l, h)$, where $C$ is a set of constraints guaranteed to be in the minimal inconsistent set, and the integers $l$, $h$ are the lower and upper bounds of constraint indices still under consideration. The initial configuration in our example is $(\{y_6 < 0\}, 0, 3)$. In each refinement step, we maintain the invariant that $C \cup \{\mathit{array}[i] \mid l \le i \le h\}$ is inconsistent. Given a configuration $(C, l, h)$, individual constraints of index between $l$ and $h$ are added to $C$ until an inconsistency is detected. In the first iteration of our running example, we process constraints from right to left, and an inconsistency is only detected when processing $y_5 > 0$. The new configuration $(\{y_6 < 0, y_5 > 0\}, 1, 3)$ is obtained by adding this constraint to the set of constraints already known to be in a minimal inconsistent set, by leaving $h$ unchanged, and by setting $l$ to the increment of the index of the new constraint. The order in which constraints are asserted is inverted after each iteration. Thus, in the next step in our example, we successively add constraints between 1 and 3 from left to right to the set $\{y_6 < 0, y_5 > 0\}$. An inconsistency is first detected when asserting $y_6 = y_5$ to this set, and the new configuration is obtained as $(\{y_6 < 0, y_5 > 0, y_6 = y_5\}, 1, 1)$, since the lower bound $l$ is now left unchanged and the upper bound is set to the decrement of the index of the constraint for which the inconsistency has been detected. The procedure terminates if $C$ in the current configuration is inconsistent or after $m$ refinements. In our example, two refinement steps yield the minimal inconsistent set $\{y_5 > 0, y_6 = y_5, y_6 < 0\}$. In general, the number of assertions is linear in the number of constraints, and the algorithm returns the exact minimal set if its cardinality is less than or equal to the upper bound $m$ of iterations.
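The refinement of configurations $(C, l, h)$ described above can be sketched in a few lines. The code below is an illustration under stated assumptions: the toy procedure `consistent` (sign constraints and equalities, decided with union-find) stands in for ICS, the constraint names `a` to `d` are hypothetical, and the preliminary step of connecting constraints with common variables is omitted.

```python
def consistent(constraints):
    """Toy decision procedure for ('eq', u, v), ('pos', u): u > 0, ('neg', u): u < 0."""
    parent = {}

    def find(u):
        parent.setdefault(u, u)
        while parent[u] != u:
            u = parent[u]
        return u

    for c in constraints:
        if c[0] == "eq":
            parent[find(c[1])] = find(c[2])
    signs = {}
    for c in constraints:
        if c[0] in ("pos", "neg"):
            signs.setdefault(find(c[1]), set()).add(c[0])
    # Inconsistent iff some equivalence class is forced both positive and negative.
    return all(len(s) < 2 for s in signs.values())


def explain(core, array, max_refinements):
    """Refine (C, lo, hi), keeping C + array[lo..hi] inconsistent throughout."""
    C = list(core)
    lo, hi = 0, len(array) - 1
    right_to_left = True
    for _ in range(max_refinements):
        if not consistent(C):
            return C
        order = range(hi, lo - 1, -1) if right_to_left else range(lo, hi + 1)
        trial, hit = list(C), None
        for i in order:
            trial.append(array[i])
            if not consistent(trial):
                hit = i
                break
        if hit is None:
            raise ValueError("core + array must be inconsistent")
        C.append(array[hit])          # array[hit] is needed for the conflict
        if right_to_left:
            lo = hit + 1
        else:
            hi = hit - 1
        right_to_left = not right_to_left   # invert the order after each iteration
    # Out of budget: fall back to the remaining overapproximation if needed.
    return C if not consistent(C) else C + array[lo:hi + 1]


array = [("pos", "b"), ("pos", "c"), ("eq", "a", "b"), ("eq", "c", "d")]
core = [("neg", "a")]
result = explain(core, array, max_refinements=2)
```

With a budget of two refinements the run mirrors the one in the text: the first pass (right to left) picks up `("pos", "b")`, the second pass (left to right) picks up `("eq", "a", "b")`, and the result is a three-element inconsistent set. With a budget of one, the function returns a larger but still inconsistent overapproximation.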

Fig. 6. Trace for linear time explain function.

Given these refinements to the satisfiability algorithm in Figure 1, we implemented an offline integration of Chaff with ICS, in which the SAT solver and the decision procedures are treated as black boxes, and both procedures are restarted in each lazy refinement step. Table 1 includes some statistics for three different configurations depending on whether don't care processing or the linear explain is enabled. For each configuration, we list the total time (in seconds) and the number of conflicts detected by the decision procedure. This table indicates that the effort of assigning don't care values depending on the asynchronous nature of the program and the use of explain functions significantly improves performance.

         don't cares, no explain    no don't cares, explain    don't cares, explain
depth    time       conflicts       time       conflicts       time       conflicts
5        0.71       66              45.23      577             0.31       16
6        2.36       132             83.32      855             0.32       18
7        12.03      340             286.81     1405            1.75       58
8        56.65      710             627.90     1942            2.90       73
9        230.88     1297            1321.57    2566            8.00       105
10       985.12     2296            -          -               15.28      185
15       -          -               -          -               511.12     646

Table 1. Offline lazy theorem proving ('-' is time > 1800 secs).

Recall that the experiments so far represent worst-case scenarios in that the given formulas are unsatisfiable. For BMC problems with counterexamples, however, our procedure usually converges much faster. Consider, for example, the mutual exclusion problem of the Bakery protocol with a guard $y_1 \ge y_2 - 1$ instead of $\neg(y_1 < y_2)$. The corresponding counterexample for $k = 5$ is produced in a fraction of a second after eight refinements.

$$(a_1, k_1, b_1, k_2) \to (a_2, 1 + k_2, b_1, k_2) \to (a_3, 1 + k_2, b_1, k_2) \to (a_3, 1 + k_2, b_2, 2 + k_2) \to (a_3, 1 + k_2, b_3, 2 + k_2)$$

This counterexample actually represents a family of traces, since it is parameterized by the constants $k_1$ and $k_2$, with $k_1, k_2 \ge 0$, which have been introduced by the ICS decision procedures.

In the case of lazy theorem proving, the offline integration is particularly expensive, since restarts imply the repeated reconstruction of ICS logical contexts. Memoization of the decision procedure calls does not improve the situation significantly, since the assignments produced by Chaff in subsequent calls usually do not have long enough common prefixes. This observation, however, might not be generalizable, since it depends on the specific, randomized heuristics of Chaff for choosing variable assignments.

         no explain                            explain
depth    time     conflicts   calls to ICS     time     conflicts   calls to ICS
5        0.03     24          162              0.01     7           71
6        0.08     48          348              0.01     7           83
7        0.19     96          744              0.02     7           94
8        0.98     420         3426             0.05     29          461
9        2.78     936         7936             0.19     70          1205
10       8.60     2008        17567            0.26     85          1543
15       -        -           -                4.07     530         13468

Table 2. Online lazy theorem proving.

In an online integration, choices for propositional variable assignments are synchronized with extending the logical context of the decision procedures with the corresponding atoms. Detection of inconsistencies in the logical context of the decision procedures triggers backtracking in the search for variable assignments. Furthermore, detected inconsistencies are propagated to the propositional search engine by adding the corresponding inconsistency clause (or, using an explanation function, a good overapproximation of the minimally inconsistent set of atoms in the logical context). Since state-of-the-art SAT solvers such as Chaff are missing the necessary API for realizing such an online integration, we developed a homegrown SAT solver which has most of the features of modern SAT solvers and integrated it with ICS. The results of using this online integration for the Bakery example can be found in Table 2 for two different configurations (see footnote 1). For each configuration, we list the total time (in seconds), the number of conflicts detected by ICS, and the total number of calls to ICS. Altogether, using an explanation facility clearly pays off in that the number of refinement iterations (conflicts) is reduced considerably.


7  Related  Work 

There has been much recent work in reducing the satisfiability problem of Boolean formulas over the theory of equality with uninterpreted function symbols to a SAT problem [5, 12, 23] using eager encodings of possible instances of equality axioms. In contrast, lazy theorem proving introduces the semantics of the formula constraints on demand by analyzing spurious counterexamples. Also, our procedure works uniformly for much richer sets of constraint theories. It would be interesting to compare the eager and the lazy approach experimentally, but benchmark suites (e.g. www.ece.cmu.edu/~mvelev) are currently only available as encodings of Boolean satisfiability problems.

In research that is most closely related to ours, Barrett, Dill, and Stump [2] describe an integration of Chaff with CVC by abstracting the Boolean constraint formula to a propositional approximation, then incrementally refining the approximation based on diagnosing conflicts using theorem proving, and finally adding the appropriate conflict clause to the propositional approximation. This integration corresponds directly to an online integration in the lazy theorem proving paradigm. Their approach to generating good explanations is different from ours in that they extend CVC with a capability of abstract proofs for overapproximating minimal sets of inconsistencies. Also, optimizations based on don't cares are not considered in [2]. The experimental results in [2] coincide with ours in that they suggest that lazy theorem proving without explanations (there called the naive approach) and offline integration quickly become impractical. Using equivalence checking for pipelined microprocessors, speedups of several orders of magnitude over their earlier SVC system are obtained.

1 The differences in the number of conflicts compared to Table 1 are due to the different heuristics of the SAT solvers used.


8  Conclusion 


We developed a bounded model checking (BMC) procedure for infinite-state systems and linear temporal logic formulas with constraints based on a reduction to the satisfiability problem of Boolean constraint logic. This procedure is shown to be sound, and although incomplete in general, we establish completeness for invariant formulas. Since BMC problems are propositionally intensive, we propose a verification technique based on a lazy combination of a SAT solver with a constraint solver, which introduces only the portion of the semantics of constraints that is relevant for constructing a BMC counterexample.

We identified a number of concepts necessary for obtaining efficient implementations of lazy theorem proving. The first one is specialized to BMC for asynchronous systems in that we generate partial Boolean assignments based on the structure of the program for restricting the search space of the SAT solver. Second, good approximations of minimal inconsistent sets of constraints at reasonable cost are essential. The proposed any-time algorithm uses a mixture of structural dependencies between constraints and a linear number of reruns of the decision procedure for refining overapproximations. Third, offline integration and restarting the SAT solver result in repetitive work for the decision procedures. Based on these observations we realized a lazy, online integration in which the construction of partial assignments in the Boolean domain is synchronized with the construction of a corresponding logical context for the constraint solver, and inconsistencies detected by the constraint solver are immediately propagated to the Boolean domain. First experimental results are very promising, and many standard engineering techniques can be applied to significantly improve running times.

We barely scratched the surface of possible applications. Given the rich set of possible constraints, including constraints over uninterpreted function symbols, for example, our extended BMC method seems to be suitable for model checking open systems, where environments are only partially specified. Also, it remains to be seen if BMC based on lazy theorem proving is a viable alternative to specialized model checking algorithms such as the ones for timed automata and extensions thereof for finding bugs, or even to AI planners dealing with resource constraints and domain-specific modeling.
Abstract. We investigate the combination of propositional SAT checkers with constraint solvers for domain-specific theories such as linear arithmetic, arrays, lists and the combination thereof. Our procedure realizes a lazy approach to satisfiability checking of propositional constraint formulas by iteratively refining Boolean formulas based on lemmas generated on demand by constraint solvers.


1.  Introduction 

Many search and optimization problems can effectively be solved using propositional reasoning techniques. Finiteness, however, is an inherent restriction of propositional encodings, and computational systems and environment models are usually expressed more succinctly in logics enriched with domain-specific constraints. Planning problems in AI, for example, may involve solving numeric resource constraints, and program analyses often require reasoning about constraints in the combination of datatypes such as integers, arrays, lists, or bit vectors.

Given a decidable constraint theory, we address the problem of constructing effective solutions to the satisfiability problem for propositional combinations of constraints. Of course, satisfiability solvers for propositional constraint formulas can easily be obtained from the combination of a propositional SAT solver with decision procedures simply by converting the problem into disjunctive normal form, but the resulting algorithm is usually prohibitively expensive. Alternatively, propositional search capabilities can be added to theorem provers, but it seems to be more effective to augment propositional SAT solvers with theorem proving capabilities.

Here we look at the specific combination of SAT solvers with constraint solvers, and we propose a method that we call lemmas on demand, which invokes the constraint solver lazily in order to efficiently prune out spurious counterexamples, namely, counterexamples that are generated by the SAT solver but discarded by the theorem prover by interpreting the propositional atoms. For example, the SAT solver might yield the satisfying assignment $p$, $\neg q$, where the propositional variable $p$ represents the atom $x = y$, and $q$ represents $f(x) = f(y)$. A decision procedure can easily detect the inconsistency in this assignment. More importantly, it can be used to generate a set of conflicting assignments that can be used to construct a lemma that further constrains the search. In the above example, the lemma $\neg p \lor q$ can be added as a new clause in the input to the SAT solver. This process of refining Boolean formulas is similar in spirit to the refinement of abstractions based on the analysis of spurious counterexamples or failed proof attempts [26, 25, 6, 16, 8, 14, 18].

* This research was supported by SRI International internal research and development, the DARPA NEST program through Contract F33615-01-C-1908 with AFRL, and the National Science Foundation under grants CCR-00-86096 and CCR-0082560.

From  a  set  of  inconsistent  constraints  in  a  spurious  counterexample  we 
obtain  an  explanation  as  an  over-approximation  of  the  minimal,  inconsistent 
subset  of  these  constraints.  The  smaller  the  explanation  that  is  generated  from 
a  spurious  counterexample,  the  greater  the  pruning  in  the  subsequent  search. 
In  this  way,  the  computation  of  explanations  accelerates  the  convergence  of 
our  procedure. 

The paper is structured as follows. Section 2 includes some background material, whereas Section 3 describes the lemmas on demand approach and various refinements thereof. Initial experience with this technique is reported in Section 4. Finally, in Section 6 we draw conclusions.


2.  Background 

We use the familiar concepts and notations of propositional logic and constraint logic. The truth values true, false are assigned to propositional variables. A literal is a propositional variable or its negation, a clause c is a disjunction of literals, and a CNF formula is a conjunction of clauses. There is a linear-time satisfiability-preserving transformation into CNF [22]. A propositional SAT solver (B-sat) is, for our purposes, a computable function that receives a CNF formula and returns either a satisfying truth assignment or unsatisfiable if such an assignment does not exist.

A (conjunctive) constraint solver, say C-sat, for a constraint theory C, is a computable function that checks whether or not a set of constraints in a theory C is satisfiable. For instance, a linear programming system is a constraint solver for linear arithmetic.

Given a constraint theory C, the set of Boolean constraints Bool(C) includes all constraints in C and it is closed under conjunction $\land$, disjunction $\lor$, implication $\to$, and negation $\neg$. The notions of satisfiability, inconsistency, satisfying assignment, and satisfiability solver are lifted to the set of Boolean constraints in the usual way.

Formulas in Bool(C) can be translated into equisatisfiable Boolean formulas as long as the consistency of sets of constraints in C is decidable. Translation schemes between propositional formulas and Boolean constraint formulas are needed. Given a formula $\varphi$, such a correspondence is easily obtained by abstracting constraints in $\varphi$ with (fresh) propositional variables. Let $\alpha$ be a function that maps constraints in C to propositional variables. This mapping induces a mapping from Boolean constraint formulas to propositional formulas. For example, the formula $\varphi = x_0 \ge 0 \land x_1 = x_0 + 1 \to x_1 \ge 1$ over linear arithmetic is mapped to $\alpha(\varphi) = p_1 \land p_2 \to p_3$, where $\alpha(x_0 \ge 0) = p_1$, $\alpha(x_1 = x_0 + 1) = p_2$, and $\alpha(x_1 \ge 1) = p_3$. Moreover, an assignment $\nu$ for propositional variables induces a set of constraints. Let $\gamma$ be the function that performs this mapping. For instance, the assignment $\nu = \{p_1 \mapsto \mathit{false}, p_2 \mapsto \mathit{true}, p_3 \mapsto \mathit{false}\}$ induces the set $\gamma(\nu) = \{x_0 < 0, x_1 = x_0 + 1, x_1 < 1\}$. Now, it is easy to see that a CNF formula $\varphi$ in Bool(C) is equisatisfiable with the Boolean formula (in CNF)

$$\alpha(\varphi) \land \Bigl( \bigwedge_{\{l_1, \ldots, l_n\} \in I(\varphi)} (\neg l_1 \lor \cdots \lor \neg l_n) \Bigr)$$

where $I(\varphi)$ is the set of subsets $\{l_1, \ldots, l_n\}$ of literals $l_i$ in $\alpha(\varphi)$ such that its "interpretation" $\{\gamma(l_1), \ldots, \gamma(l_n)\}$ is inconsistent in C. Thus, every Bool(C) formula can be transformed into an equisatisfiable Boolean formula as long as there is a constraint solver for C. On the other hand, the reduction seems to be infeasible, since an exponential number of C-inconsistency checks is required in the worst case. It has been observed, however, that in many practical cases only small fragments of the set of C-inconsistencies are needed. The main problem here is to identify small subsets of the set of all C-inconsistencies which are sufficient to establish satisfiability of the Boolean constraint formula at hand.
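The two translation functions can be sketched directly. In the code below, `abstract` plays the role of the abstraction function and `concretize` the role of its inverse on assignments. The representation is an assumption made for this sketch: constraints are opaque strings, literals are (constraint, polarity) pairs, and propositional variables are DIMACS-style positive integers. Unlike the text, the sketch keeps a negated constraint as a pair with polarity `False` instead of rewriting it (for example into a strict inequality).

```python
def abstract(clauses):
    """Replace each distinct constraint by a fresh propositional variable 1, 2, ..."""
    var_of = {}
    cnf = []
    for clause in clauses:
        row = []
        for constraint, positive in clause:
            # len(var_of) is read before the new key is inserted.
            var = var_of.setdefault(constraint, len(var_of) + 1)
            row.append(var if positive else -var)
        cnf.append(row)
    return cnf, var_of


def concretize(assignment, var_of):
    """Map a truth assignment {var: bool} back to a set of (constraint, polarity)."""
    constraint_of = {var: c for c, var in var_of.items()}
    return {(constraint_of[var], value) for var, value in assignment.items()}


# The implication from the text, written as one clause: not p1 or not p2 or p3.
phi = [[("x0 >= 0", False), ("x1 = x0 + 1", False), ("x1 >= 1", True)]]
cnf, var_of = abstract(phi)
induced = concretize({1: False, 2: True, 3: False}, var_of)
```

Here `cnf` is `[[-1, -2, 3]]`, and `induced` pairs the first and third constraints with `False` and the second with `True`, which corresponds to the induced constraint set in the example.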


3.  Lemmas  on  Demand 

We propose an algorithm based on the refinement of Boolean formulas with inconsistency lemmas that are generated on demand. We restrict ourselves to formulas in CNF, since most Boolean SAT solvers expect their input to be in this format.

The procedure sat($\varphi$) in Figure 1 combines a Boolean SAT solver B-sat and a domain-specific constraint solver C-sat. B-sat generates a candidate Boolean assignment for $\alpha(\varphi)$. If there is no such candidate, the algorithm terminates, since $\varphi$ is clearly unsatisfiable. Otherwise the satisfiability solver C-sat is used to check whether or not the Boolean assignment $\nu$ determines a valid assignment for $\varphi$. If the assignment is not valid, new propositional clauses (inconsistency lemmas) are added to the propositional formula at hand. The procedure refine is crucial in that it generates such new clauses. In order to guarantee soundness, all (interpretations of) clauses returned by refine are assumed to be implied by $\varphi$. In addition, the algorithm is complete if at least one clause returned by refine is not subsumed by clauses already in $p$. Alternatively, completeness can also be achieved by disabling infinite loops in which refine is only adding clauses subsumed by clauses already in $p$. We will return to a discussion about specific implementations of refine functions in Section 3.2.

    procedure sat(φ)
      p := α(φ);
      loop
        ν := B-sat(p);
        if ν = unsatisfiable then return unsatisfiable
        else if C-sat(γ(ν)) then return satisfiable
        else p := p ∪ refine(ν)

Figure 1. Lemmas on Demand for Bool(C).
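The loop of Figure 1 can be made concrete with very small stand-ins for its two components. The sketch below is an assumption-laden illustration, not the implementation evaluated in the experiments: `b_sat` enumerates assignments by brute force, `c_sat` decides only equalities and disequalities between variables, and the refinement step is the naive one that blocks exactly the failed assignment, without any explanation function.

```python
from itertools import product


def b_sat(clauses, num_vars):
    """Brute-force propositional solver: a satisfying {var: bool}, or None."""
    for bits in product([False, True], repeat=num_vars):
        nu = {i + 1: b for i, b in enumerate(bits)}
        if all(any(nu[abs(l)] == (l > 0) for l in clause) for clause in clauses):
            return nu
    return None


def c_sat(literals):
    """Equality theory: literals are (x, y, is_equal) triples."""
    parent = {}

    def find(u):
        parent.setdefault(u, u)
        while parent[u] != u:
            u = parent[u]
        return u

    for x, y, is_equal in literals:
        if is_equal:
            parent[find(x)] = find(y)
    return all(find(x) != find(y) for x, y, is_equal in literals if not is_equal)


def sat(clauses, atoms):
    """atoms[v] = (x, y) means propositional variable v stands for x = y."""
    p = [list(c) for c in clauses]
    lemmas = 0
    while True:
        nu = b_sat(p, len(atoms))
        if nu is None:
            return "unsatisfiable", lemmas
        gamma = [(atoms[v][0], atoms[v][1], value) for v, value in nu.items()]
        if c_sat(gamma):
            return "satisfiable", lemmas
        # Naive refine: one clause that rules out this inconsistent assignment.
        p.append([-v if value else v for v, value in nu.items()])
        lemmas += 1


atoms = {1: ("x", "y"), 2: ("y", "z"), 3: ("x", "z")}
verdict = sat([[1], [2], [-3]], atoms)   # x = y, y = z, x != z
```

For this input the first candidate assignment is rejected by `c_sat`, the lemma `[-1, -2, 3]` (transitivity of equality) is added, and the second call to `b_sat` finds no candidate. Each added lemma excludes at least one assignment, so the loop terminates on any finite input.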

3.1.  Constraint  theories:  Examples 

One advantage of our approach is that it works uniformly for a large class of constraint theories, since the main requirement on these theories is the decidability of the conjunctive satisfiability problem. We review some of the more important constraint theories with polynomial satisfiability problems for conjunctions of constraints. It follows that the satisfiability problems for the corresponding Boolean constraint theories are all NP-complete. In the following we assume as given a countably infinite set V of variables, and conjunctions of constraints are represented by finite sets.

3.1.1.  Equality  for  Constants. 

Satisfiability of conjunctive constraints C consisting of equalities $x = y$ and disequalities $x \ne y$ for variables $x$, $y$ can be decided in linear time in the size of C. First, a graph is built, where the nodes are the variables and there is an edge between nodes $x$ and $y$ iff C contains the equality $x = y$. Now, C is satisfiable iff for all $u \ne v$ in C it is the case that $u$ and $v$ are not connected in this graph.
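The graph construction can be written down almost literally. The following sketch assumes that variables are given as strings and that the two kinds of constraints arrive as separate lists of pairs; the function name is made up for this example.

```python
from collections import defaultdict, deque


def equalities_satisfiable(equalities, disequalities):
    """Decide a conjunction of x = y and x != y constraints over variables."""
    graph = defaultdict(set)
    for x, y in equalities:
        graph[x].add(y)
        graph[y].add(x)

    # Label every node with a representative of its connected component (BFS).
    component = {}
    for start in list(graph):
        if start in component:
            continue
        component[start] = start
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in graph[u]:
                if v not in component:
                    component[v] = start
                    queue.append(v)

    # A variable that occurs in no equality is its own component.
    return all(component.get(u, u) != component.get(v, v)
               for u, v in disequalities)
```

For example, `equalities_satisfiable([("x", "y"), ("y", "z")], [("x", "z")])` is `False`, because `x` and `z` are connected through `y`. Each node and edge is visited once, which matches the linear bound stated above.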

3.1.2.  Equality  for  Uninterpreted  Functions. 

Terms are either variables or applications $f(t_1, \ldots, t_n)$, where $f$ is a function symbol in some given signature of arity $n$. Satisfiability for a conjunction of equations and disequations over terms is decidable in $O(n \log(n))$ using congruence closure [11]. Satisfiability procedures for theories such as the one for cons, car, and cdr can be obtained using congruence closure algorithms [20] by adding all relevant instances of universally quantified axiom schemes such as $x = \mathit{car}(\mathit{cons}(x, y))$. Similarly, using Ackermann's trick [1] or a variation thereof, one can transform Boolean constraints over equalities for uninterpreted terms to an equisatisfiable Boolean problem with equations over variables as literals by adding all possible instances of the congruence axiom and renaming uninterpreted subterms with variables. In the worst case, the number of such axioms is proportional to the square of the length of the given formula.
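A naive fixpoint version of congruence closure is enough to see how the introductory example ($x = y$ together with $f(x) \ne f(y)$) is refuted. The sketch below is deliberately simple and does not achieve the $O(n \log(n))$ bound cited above; the term representation (strings for variables, tuples `(symbol, arg1, ...)` for applications) is an assumption of this example.

```python
def subterms(t, acc):
    """Collect t and all of its subterms into the set acc."""
    acc.add(t)
    if isinstance(t, tuple):
        for arg in t[1:]:
            subterms(arg, acc)


def congruence_satisfiable(equations, disequations):
    """Decide equations and disequations over uninterpreted terms."""
    terms = set()
    for s, t in equations + disequations:
        subterms(s, terms)
        subterms(t, terms)
    parent = {t: t for t in terms}

    def find(t):
        while parent[t] != t:
            t = parent[t]
        return t

    for s, t in equations:
        parent[find(s)] = find(t)

    # Merge applications with equal symbols and pairwise equal arguments
    # until nothing changes (quadratic per pass).
    apps = [t for t in terms if isinstance(t, tuple)]
    changed = True
    while changed:
        changed = False
        for a in apps:
            for b in apps:
                if (find(a) != find(b) and a[0] == b[0] and len(a) == len(b)
                        and all(find(x) == find(y)
                                for x, y in zip(a[1:], b[1:]))):
                    parent[find(a)] = find(b)
                    changed = True
    return all(find(s) != find(t) for s, t in disequations)
```

Calling `congruence_satisfiable([("x", "y")], [(("f", "x"), ("f", "y"))])` returns `False`: once `x` and `y` are merged, `f(x)` and `f(y)` become congruent and the disequation is violated.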

3.1.3.  Theories  of  Arithmetic. 

Linear arithmetic constraints are built up from inequalities over linear arithmetic
terms including rational constants and addition. When interpreted over
the rationals, the conjunctive satisfiability problem for linear arithmetic constraints
is polynomial, since it is equivalent to the linear programming problem,
which is known to be polynomial; when linear arithmetic terms are
interpreted over the integers, the problem becomes NP-complete.
The conjunctive satisfiability problem for nonlinear arithmetic constraints,
which also include multiplication, is still decidable when interpreted over
the reals, but becomes undecidable over the integers. Pratt observed that
most inequalities in program verification are of the form x − y ≤ c, where
c is a constant. Think of a conjunction C of these constraints as representing a
directed graph whose nodes are labelled with variables and in which there is an edge
from x to y of weight c for each constraint x − y ≤ c. Now, C is satisfiable iff
there is no negative-weight cycle in this graph. Using the Bellman-Ford
algorithm, satisfiability of C is decided in time O(n · m), where n is the number of
variables and m the number of constraints in C. Shostak's [28] loop residue
algorithm for linear constraints a * x + b * y ≤ c reduces to Pratt's algorithm
when applied to difference constraints.
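
The negative-cycle test for difference constraints can be sketched as follows. This is an illustrative sketch, not the procedure used in ICS; the function name and the triple format (x, y, c), standing for x − y ≤ c, are assumptions. The code orients each edge from y to x, the reverse of the convention in the text; reversing all edges does not change which cycles are negative, and this orientation makes the final distances a satisfying assignment.

```python
def difference_constraints_satisfiable(constraints):
    """Check a conjunction of constraints (x, y, c), each meaning x - y <= c.

    Returns (True, model) or (False, None).
    """
    constraints = list(constraints)
    variables = {v for x, y, _ in constraints for v in (x, y)}
    # Starting every node at 0 simulates a virtual source with a
    # zero-weight edge to each variable.
    dist = {v: 0 for v in variables}
    for _ in range(len(variables) + 1):
        changed = False
        for x, y, c in constraints:
            # Relax the edge y -> x of weight c.
            if dist[y] + c < dist[x]:
                dist[x] = dist[y] + c
                changed = True
        if not changed:
            # Fixed point: every constraint holds under dist.
            return True, dist
    # Still relaxing after |V| + 1 rounds: a negative-weight cycle exists.
    return False, None
```

When the check succeeds, the returned dictionary assigns a value to every variable such that each constraint x − y ≤ c holds.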


3.1.4.  Theory  of  Fixed-Sized  Bitvectors. 

A core theory of equalities over fixed-sized bitvectors includes variables x_n,
which are interpreted over bitvectors of width n, extraction x_n[i : j] of bits i
through j, and concatenation of two bitvector terms. From the results in [7] it
follows that the conjunctive satisfiability problem for this theory is decidable
in polynomial time when the widths of variables and the extraction positions are
integer constants. These problems can easily be translated to equisatisfiable
propositional SAT problems by bitwise splitting of the bitvector constraints.
In practice, however, considerable performance gains have been reported by
using domain-specific bitvector procedures instead of SAT solvers [15]. For
example, a bitvector encoding of the shift register BMC benchmark is exponentially
more succinct than the corresponding Boolean formula [10].

3.1.5. Combination of Satisfiability Procedures.

Many verification problems require solving constraint problems in the union
of constraint theories. There are two basic paradigms for combining decision
procedures. The Nelson-Oppen [21] method combines decision procedures
for disjoint theories by exchanging equality information on the shared variables.
If the constituent decision procedures are polynomial and the theories are
convex, then the combined Nelson-Oppen procedure is polynomial, too. In Shostak's
method [29, 24, 27] the combination of the theory of pure equality with canonizable and
solvable theories is decided through an extension of congruence closure that
yields a canonizer for the combined theory. Again, if the constituent canonizers
and solvers are polynomial-time, then Shostak's algorithm also runs in
polynomial time. All of the individual theories listed above can be combined
using either the Nelson-Oppen or the Shostak approach. Consequently, satisfiability
for propositional logic with constraints in the combination of any
subset of these theories is NP-complete.

3.2.  Refinements 

Now we describe some possible implementations of the refine function in
Figure 1. A simple implementation of refine creates clauses of increasing size
in each iteration. For example, if α(x0 > 1) ↦ p1, α(x0 > 0) ↦ p2,
α(x1 = x0) ↦ p3, α(x1 > 1) ↦ p4, α(x1 > 0) ↦ p5, the first call
to refine produces the clauses ¬p1 ∨ p2 and ¬p4 ∨ p5, the second one produces
the clauses ¬p1 ∨ ¬p3 ∨ p4, ¬p1 ∨ ¬p3 ∨ p5, and so on. This unguided
enumeration is a sound and complete procedure, but it is usually infeasible
in practice, since the number of clauses of size k is O(n^k), where n is the
number of constraints.

Alternatively, clauses are added in a guided way based on the analysis of
the set of constraints corresponding to a Boolean assignment. For instance, if
the Boolean assignment ν = {p1 ↦ true, p2 ↦ false, p3 ↦ false} has been
tested to yield an inconsistent set of constraints, the procedure refine adds
the clause ¬p1 ∨ p2 ∨ p3. This clause clearly prevents the invalid assignment
from being generated again by B-sat. Therefore, the procedure of iteratively refining a
Boolean formula based on the newly detected inconsistencies is terminating
and complete. However, a naive implementation is also inefficient in practice,
since usually only small fragments of the assignment ν are inconsistent. For example,
suppose that an invalid assignment is associated with the following set of
constraints:

{x0 > 0, y0 > 0, x1 = x0, y1 = y0 + 1, x2 = x1 + 1, y2 = y1, x2 > 1, x2 < 1}

It is clear that {x2 > 1, x2 < 1} or {x0 > 0, x1 = x0, x2 = x1 + 1, x2 < 1}
are sufficient to describe the conflict. Therefore, let us assume that there
is a function explain that returns an over-approximation of the minimal set
of constraints that implies the inconsistency detected by C-sat. This function
is similar to the conflict resolution procedures found in Boolean SAT solvers
such as GRASP [17] or Chaff [19]. Abstractly, conflict resolution procedures
in Boolean SAT solvers can be seen as a function that receives a conflicting
clause¹ and returns a new clause that prevents this specific conflict in future
iterations. These new clauses are called conflict clauses, and the process of
constructing them is sometimes referred to as learning. There is an obvious
trade-off between the conciseness of this approximation and the cost
of computing it. We propose an algorithm for finding such an over-approximation
based on rerunning the constraint solver O(m × n) times,
where m is some given upper bound on the number of iterations (see below)
and n is the number of given constraints.

The run in Figure 2 illustrates this procedure. The constraints in Figure 2.(a)
are asserted to C-sat from left to right. Since C-sat detects a conflict
when asserting y6 < 0, this constraint is in the minimal inconsistent set.
Now, an over-approximation of the minimal inconsistent sets is produced
by connecting constraints with common variables (Figure 2.(a)). This over-approximation
is iteratively refined by collecting the constraints in an array as
illustrated in Figure 2.(b). Configurations consist of triples (C, l, h), where C
is a set of constraints guaranteed to be in the minimal inconsistent set, and the
integers l, h are the lower and upper bounds of constraint indices still under
consideration. The initial configuration in our example is ({y6 < 0}, 0, 3). In
each refinement step we maintain the invariant that C ∪ {array[i] | l ≤ i ≤ h}
is inconsistent. Given a configuration (C, l, h), individual constraints of index
between l and h are added to C until an inconsistency is detected. In the first
iteration of our running example we process constraints from right to left,
and an inconsistency is only detected when processing y5 > 0. The new
configuration ({y6 < 0, y5 > 0}, 1, 3) is obtained by adding this constraint
to the set of constraints already known to be in a minimal inconsistent set,
by leaving h unchanged, and by setting l to the increment of the index of the
new constraint. The order in which constraints are asserted is inverted after
each iteration. Thus, in the next step in our example, we successively add
constraints between 1 and 3 from left to right to the set {y6 < 0, y5 > 0}.
An inconsistency is first detected when asserting y6 = y5 to this set, and
the new configuration is obtained as ({y6 < 0, y5 > 0, y6 = y5}, 1, 1),
since the lower bound l is now left unchanged and the upper bound is set
to the decrement of the index of the constraint for which the inconsistency
has been detected. The procedure terminates if C in the current configuration
is inconsistent or after m refinements. In our example, two refinement steps
yield the minimal inconsistent set {y5 > 0, y6 = y5, y6 < 0}. In general, the
number of assertions is linear in the number of constraints, and the algorithm
returns the exact minimal set if its cardinality is less than or equal to the upper
bound m of iterations.

¹ A conflicting clause is a clause in which all literals are assigned to false.

Figure 2. Trace for linear time explain function.

procedure collect(f, ν)
  if f = c1 ∨ ... ∨ cn then
    if f ∈ γ(ν) then return collect(choose({ci | ci ∈ γ(ν)}), ν)
    else return ⋃ i∈[1,n] collect(ci, ν)
  if f = c1 ⇔ c2 then
    return collect(c1, ν) ∪ collect(c2, ν)
  if f = ¬c then
    return collect(c, ν)
  if is-constraint(f) then
    if f ∈ γ(ν) then return {f} else return {¬f}
  return ∅

Figure 3. Collecting relevant constraints.
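
The explain procedure with configurations (C, l, h) can be sketched as follows. This is a hedged illustration: the consistency check is passed in as a parameter, and the brute-force checker over a small integer domain is only a stand-in for C-sat. The two array entries "x1 = 1" and "x2 = x1" are hypothetical fillers, since the full contents of the array in Figure 2 are not reproduced here.

```python
from itertools import product

def explain(array, conflict, consistent, max_iters):
    """Shrink an inconsistent set following the (C, l, h) configurations.

    array: constraints asserted before `conflict`; consistent: callable on a
    list of constraints; max_iters: the bound m on refinement steps.
    """
    core, lo, hi = [conflict], 0, len(array) - 1
    right_to_left = True
    for _ in range(max_iters):
        if not consistent(core):
            break
        order = range(hi, lo - 1, -1) if right_to_left else range(lo, hi + 1)
        trial = list(core)
        for i in order:
            trial.append(array[i])
            if not consistent(trial):
                core.append(array[i])          # array[i] is needed
                if right_to_left:
                    lo = i + 1                 # indices below i are not needed
                else:
                    hi = i - 1                 # indices above i are not needed
                break
        right_to_left = not right_to_left      # invert the assertion order
    if consistent(core):
        core = core + array[lo:hi + 1]         # over-approximation
    return core

def brute_force_checker(variables, domain):
    """Toy stand-in for C-sat: search all assignments over a finite domain."""
    def consistent(constraints):
        return any(all(pred(dict(zip(variables, vals))) for _, pred in constraints)
                   for vals in product(domain, repeat=len(variables)))
    return consistent

# Constraints are (label, predicate) pairs; two of them are hypothetical.
EXAMPLE_ARRAY = [
    ("y5 > 0", lambda e: e["y5"] > 0),
    ("x1 = 1", lambda e: e["x1"] == 1),
    ("y6 = y5", lambda e: e["y6"] == e["y5"]),
    ("x2 = x1", lambda e: e["x2"] == e["x1"]),
]
EXAMPLE_CONFLICT = ("y6 < 0", lambda e: e["y6"] < 0)
```

With a bound of two refinement steps the sketch returns the three constraints of the minimal set from the running example; with a bound of one it returns the partial core plus the constraints still under consideration.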

An additional refinement can be introduced in the procedure sat(φ), since,
in most cases, for a given assignment ν only a small subset of γ(ν) needs to be
considered. Overly eager assignments result in both useless search and overly
specific counterexamples. For instance, assume the formula (q ∧ p1) ∨ (¬q ∧ p2)
and the assignment ν = {q ↦ true, p1 ↦ true, p2 ↦ true}, and suppose the
following two situations:

1. α(p1) ↦ x ≥ 0, and α(p2) ↦ x = −1. C-sat(γ(ν)) returns unsatisfiable,
since {x ≥ 0, x = −1} is inconsistent. Therefore the assignment ν
is discarded and the search continues. However, constraint p2 is clearly
irrelevant, that is, it is a don't care.

2. α(p1) ↦ x ≥ 0, and α(p2) ↦ x ≤ 1. C-sat(γ(ν)) returns satisfiable.
However, the resulting set of models is overly specific in that the value of
x is restricted to those in the interval [0, 1].

To solve this problem we keep the structure of the formula before CNF
translation. The structure of the formula is used to decide whether a constraint
is relevant in a given assignment or not. The procedure collect(f, ν) in Figure 3
collects all relevant constraints for a formula f and an assignment ν.
For simplicity, this procedure only considers the propositional connectives
⇔, ∨, and ¬. The CNF translation adds a new propositional variable for
each non-atomic sub-formula. It is important to notice that f ∈ γ(ν) iff
ν(α(f)) = true, that is, the formula f is assigned to true in the assignment ν.
The function is-constraint(f) returns true if f is a constraint. For instance,
the formula (q ∧ p1) ∨ (¬q ∧ p2) is represented as ¬(¬q ∨ ¬p1) ∨ ¬(q ∨ ¬p2),
and is translated to the following CNF formula:

(a1 ∨ a2) ∧
(¬q ∨ ¬p1 ∨ a1) ∧ (¬a1 ∨ q) ∧ (¬a1 ∨ p1) ∧
(q ∨ ¬p2 ∨ a2) ∧ (¬a2 ∨ ¬q) ∧ (¬a2 ∨ p2)

where a1 and a2 are auxiliary propositional variables, that is, a1 = ¬(¬q ∨ ¬p1)
and a2 = ¬(q ∨ ¬p2). Given an assignment ν = {q ↦ true, p1 ↦ true, p2 ↦
true, a1 ↦ true, a2 ↦ false}, it is clear that collect(f, ν) = {q, p1}, that
is, the value of p2 is a don't care.

procedure refine-1(ν) return { clausify(γ(ν)) }
procedure refine-2(ν) return { clausify(explain(γ(ν))) }
procedure refine-3(ν) return { clausify(collect(φ, ν)) }
procedure refine-4(ν) return { clausify(explain(collect(φ, ν))) }
procedure clausify(C) return {α(l) | ∃ l' ∈ C ∧ l = ¬l'}

Figure 4. Refinement functions.
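
The collect procedure can be sketched in Python as follows. The tuple encoding of formulas and the helper `holds` are assumptions made for this illustration; atoms are plain strings and are all treated as constraints, and `choose` is taken to pick the first true disjunct.

```python
def holds(f, nu):
    """Truth value of formula f under the total assignment nu (dict atom -> bool)."""
    if isinstance(f, str):
        return nu[f]
    op = f[0]
    if op == "not":
        return not holds(f[1], nu)
    if op == "or":
        return any(holds(c, nu) for c in f[1:])
    if op == "iff":
        return holds(f[1], nu) == holds(f[2], nu)
    raise ValueError(f"unknown connective: {op}")

def collect(f, nu):
    """Return the literals that are relevant for the value of f under nu."""
    if isinstance(f, str):
        return {f} if nu[f] else {("not", f)}
    op = f[0]
    if op == "or":
        true_disjuncts = [c for c in f[1:] if holds(c, nu)]
        if true_disjuncts:
            # One true disjunct suffices to justify a true disjunction.
            return collect(true_disjuncts[0], nu)
        # A false disjunction needs every disjunct to be false.
        return set().union(*(collect(c, nu) for c in f[1:]))
    if op == "iff":
        return collect(f[1], nu) | collect(f[2], nu)
    if op == "not":
        return collect(f[1], nu)
    raise ValueError(f"unknown connective: {op}")
```

On the formula ¬(¬q ∨ ¬p1) ∨ ¬(q ∨ ¬p2) with q, p1 and p2 all true, the sketch selects the first disjunct and never visits p2.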

Figure 4 summarizes the guided refinement procedures discussed above.
The procedure refine-1 implements the naive approach without explanation
capability and without specific consideration of don't cares. The procedure clausify
converts a set of conflicting constraints to a clause. Procedure refine-2 uses the
explanation facility but no don't cares, whereas refine-3 handles don't cares
by collecting relevant constraints with collect in Figure 3 but uses no explanations.
Finally, the procedure refine-4 uses all optimizations described in this section.
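
The overall offline loop with the naive refinement (in the spirit of refine-1) can be sketched as follows. Everything here is a toy stand-in chosen for the illustration: B-sat is a brute-force enumeration, C-sat only handles integer bounds of the form x ≤ c or x ≥ c, and the names `b_sat`, `c_sat`, and `lazy_sat` as well as the DIMACS-style clause format are assumptions.

```python
from itertools import product

def b_sat(num_vars, clauses):
    """Toy B-sat: return a satisfying tuple of booleans or None."""
    for bits in product([False, True], repeat=num_vars):
        if all(any(bits[abs(l) - 1] == (l > 0) for l in cl) for cl in clauses):
            return bits
    return None

def c_sat(literals):
    """Toy C-sat for integer bounds; literals are (var, op, c, polarity)."""
    lo, hi = {}, {}
    for var, op, c, positive in literals:
        if not positive:
            # Negate the bound over the integers.
            op, c = (">=", c + 1) if op == "<=" else ("<=", c - 1)
        if op == "<=":
            hi[var] = min(hi.get(var, c), c)
        else:
            lo[var] = max(lo.get(var, c), c)
    return all(lo[v] <= hi[v] for v in lo if v in hi)

def lazy_sat(num_vars, clauses, atoms):
    """atoms maps a propositional variable index to a bound (var, op, c).

    Returns (model or None, number of lemmas added).
    """
    clauses = [list(c) for c in clauses]
    lemmas = 0
    while True:
        model = b_sat(num_vars, clauses)
        if model is None:
            return None, lemmas
        if c_sat([(*atoms[i], model[i - 1]) for i in atoms]):
            return model, lemmas
        # Naive refinement: block exactly this assignment to the atoms.
        clauses.append([-i if model[i - 1] else i for i in atoms])
        lemmas += 1
```

Replacing the blocking clause by one built from an explanation or from the relevant constraints only would correspond to the other refinement functions of Figure 4.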

3.3.  Online  integration 

So far, we have described an offline integration of B-sat and C-sat, in which the
solvers are treated as black boxes, and both procedures are restarted in each
refinement step. However, some C-sat tools support backtracking. In this case,
an online integration is more appropriate, where choices for propositional
variable assignments are synchronized with extending the logical context
of C-sat with the corresponding atoms. Detection of inconsistencies in
the logical context of C-sat triggers backtracking in the search for variable
assignments. Furthermore, detected inconsistencies are propagated to
the propositional search engine by adding the corresponding inconsistency
clause (or, using an explanation function, a good over-approximation of the
minimally inconsistent set of atoms in the logical context).

procedure dpll()
  loop
    if decide() = done then return satisfiable
    loop
      cc := bcp();
      if cc = nil then break
      if not conflict-resolution(cc) then
        return unsatisfiable

Figure 5. Davis-Putnam procedure.

Figure 5 contains the main loop of the Davis-Putnam procedure found in
most Boolean SAT solvers [17]. The algorithm starts with an empty Boolean
assignment, and traverses the space of truth assignments implicitly using
a backtrack search algorithm. The search process iteratively performs the
following steps: it extends the current assignment by making a decision assignment
to an unassigned variable (procedure decide); it extends the current
assignment by following logical consequences of the assignments made so
far (procedure bcp), where the deduction process may also identify and return a
conflicting clause (variable cc), implying that the current assignment is not
satisfiable; and it undoes (backtracks) the current assignment, if a conflict was detected,
thus allowing another assignment to be tried (procedure conflict-resolution).
The procedure bcp implements Boolean constraint propagation, which corresponds
to the application of the unit clause rule proposed by M. Davis and
H. Putnam [9].
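
The decide, propagate, and backtrack steps can be sketched as a small recursive procedure. This is a deliberate simplification and not the loop of Figure 5: it uses chronological backtracking through recursion and does no clause learning. The clause format (lists of nonzero integers, negative meaning negated) is an assumption of the example.

```python
def bcp(clauses, assignment):
    """Unit propagation; mutates assignment, returns a conflicting clause or None."""
    changed = True
    while changed:
        changed = False
        for clause in clauses:
            unassigned, satisfied = [], False
            for lit in clause:
                value = assignment.get(abs(lit))
                if value is None:
                    unassigned.append(lit)
                elif value == (lit > 0):
                    satisfied = True
                    break
            if satisfied:
                continue
            if not unassigned:
                return clause  # all literals are false: conflicting clause
            if len(unassigned) == 1:
                # Unit clause rule: the last open literal must be true.
                assignment[abs(unassigned[0])] = unassigned[0] > 0
                changed = True
    return None

def dpll(clauses, assignment=None):
    """Return a satisfying assignment (dict var -> bool) or None."""
    assignment = dict(assignment or {})
    if bcp(clauses, assignment) is not None:
        return None  # conflict: the caller tries the other branch
    variables = {abs(lit) for clause in clauses for lit in clause}
    free = sorted(variables - set(assignment))
    if not free:
        return assignment
    for value in (True, False):  # decision on the first unassigned variable
        result = dpll(clauses, {**assignment, free[0]: value})
        if result is not None:
            return result
    return None
```

Production solvers such as those cited replace the recursion with an explicit trail, conflict analysis, and non-chronological backtracking.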

As described above, the explain function is similar to the conflict resolution
procedure found in Boolean SAT solvers [17, 19]. Therefore, the
conflict resolution procedure can be used to refine the result produced by
the explain function in an online integration. For instance, suppose that the
explain function returns the set {y > 10, y < 3} as an explanation for
a conflict detected by C-sat. Then, this set is used to build the conflicting
clause {¬(y > 10), ¬(y < 3)}, which is sent to the conflict resolution
procedure in B-sat. A conflict clause is then produced by B-sat. Figure 6
contains our online algorithm. The procedure propagate-to-C is responsible
for sending recently assigned constraints to C-sat; it returns a conflicting clause
if an inconsistency is detected. In other words, the procedure propagate-to-C
implements the bridge between B-sat and C-sat. In our online algorithm,
the procedure collect behaves slightly differently, since it must handle unassigned
variables, because ν can be a partial assignment. We also keep track of
which constraints were already sent to C-sat, so the procedure collect only
collects the unsent constraints. The procedure C-assert is an incremental version
of procedure C-sat in Figure 1, that is, it extends the logical context
of C-sat with the new constraints in the variable relevant-constraints. The
procedure propagate-to-C is called when a satisfiable Boolean assignment is
found (decide returns done), or when the flag should-propagate-to-C is active.
Different heuristics can be used to activate this flag; in our implementation it
is activated every time a given number of new constraints are assigned a
Boolean value. So, if the problem only contains propositional variables, our
algorithm will behave like a standard Boolean SAT solver. Although it is not
shown in Figure 6, the procedure decide must request C-sat to create
a new backtracking point, and conflict-resolution must request C-sat to execute
the backtracking. Our integrated algorithm is compatible with any kind
of decision heuristic and standard optimizations such as non-chronological
backtracking and learning [17].

procedure C-dpll()
  loop
    status := decide();
    if status = done or should-propagate-to-C then
      cc := propagate-to-C();
      if cc ≠ nil then
        status := not-done;
        if not conflict-resolution(cc) then
          return unsatisfiable
    if status = done then return satisfiable
    loop
      cc := bcp();
      if cc = nil then break
      if not conflict-resolution(cc) then
        return unsatisfiable

procedure propagate-to-C()
  relevant-constraints := collect(φ, ν);
  if C-assert(relevant-constraints) return nil
  return clausify(explain(relevant-constraints))

Figure 6. Online Integration.

Figure 7. Bakery Mutual Exclusion Protocol.


4.  Experiments 

We implemented several refinements of the basic lazy theorem proving algorithm
from Section 3, using Chaff [19] for the offline integration, and ICS [12]
for deciding constraints. ICS is a ground decision procedure for the combination
of linear arithmetic constraints, the theory of tuples, arrays, bitvectors,
and equality over uninterpreted functions. Since state-of-the-art Boolean SAT
solvers such as Chaff lack the API necessary for realizing an
online integration, we used a home-grown SAT solver for the online integration.
We describe some of our experiments using the Bakery mutual
exclusion protocol in Figure 7² with initial states y1 ≥ 0 ∧ y2 ≥ 0. The
basic idea is that of a bakery, where customers take numbers, and whoever
has the lowest number gets service next. Here, of course, "service" means
entry to the critical section. In our example, there are only two processes (P1
and P2). The program location a3 (b3) represents the critical section of the
process P1 (P2). The variable y1 (y2) contains the number that P1 (P2) uses to
enter the critical section; it is zero if the process is not trying to enter the
critical section. Only one process can execute a transition at a time. In
this example, we are interested in the property that the processes are never
in their critical sections at the same time. For validating this property we use
bounded model checking (BMC) to search for counterexamples of length k to
the model checking problem M ⊨ φ, where M is the system (program) being
verified, and φ is the mutual exclusion property. This technique was
introduced for finite systems in [4]. Here, we are working with an extension
of the BMC methodology to infinite-state systems [10, 30].

We use the convention that current-state variables are always written as y1, y2
whereas the next-state variables are written as y1', y2'. In addition, x^i represents
the value of the variable x at time i. The variable pc1 (pc2) is the program
counter of the process P1 (P2). Thus the formula that describes the initial state
is:

\[ pc_1^0 = a_1 \land y_1^0 \ge 0 \land pc_2^0 = b_1 \land y_2^0 \ge 0 \]

² See also http://www.csl.sri.com/~demoura/bmc-examples.

We want to verify the property ¬(pc1 = a3 ∧ pc2 = b3); thus, a counterexample
of length k is a trace that reaches the goal (pc1 = a3 ∧ pc2 = b3).
The transitions are encoded as:

\[
\begin{array}{l}
(pc_1^i = a_1 \land y_1^{i+1} = y_2^i + 1 \land pc_1^{i+1} = a_2 \land pc_2^{i+1} = pc_2^i \land y_2^{i+1} = y_2^i) \;\lor \\
(pc_1^i = a_2 \land (y_2^i = 0 \lor y_1^i \le y_2^i) \land y_1^{i+1} = y_1^i \land pc_1^{i+1} = a_3 \land pc_2^{i+1} = pc_2^i \land y_2^{i+1} = y_2^i) \;\lor \\
(pc_1^i = a_3 \land y_1^{i+1} = 0 \land pc_1^{i+1} = a_1 \land pc_2^{i+1} = pc_2^i \land y_2^{i+1} = y_2^i) \;\lor \\
(pc_2^i = b_1 \land y_2^{i+1} = y_1^i + 1 \land pc_2^{i+1} = b_2 \land pc_1^{i+1} = pc_1^i \land y_1^{i+1} = y_1^i) \;\lor \\
(pc_2^i = b_2 \land (y_1^i = 0 \lor \lnot (y_1^i \le y_2^i)) \land y_2^{i+1} = y_2^i \land pc_2^{i+1} = b_3 \land pc_1^{i+1} = pc_1^i \land y_1^{i+1} = y_1^i) \;\lor \\
(pc_2^i = b_3 \land y_2^{i+1} = 0 \land pc_2^{i+1} = b_1 \land pc_1^{i+1} = pc_1^i \land y_1^{i+1} = y_1^i)
\end{array}
\]

This encoding includes the frame axioms to describe which variables a
transition does not affect. The program counters (pc1 and pc2) can be encoded
using propositional variables, since their domains are finite.
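
As a concrete companion to the symbolic encoding, the six transitions can be explored with an explicit-state bounded search. This is only an illustrative sketch and not the BMC procedure of the paper: it starts from the single initial state y1 = y2 = 0 (an assumption, whereas the symbolic encoding covers all y1, y2 ≥ 0), it assumes the guards compare with ≤, and the function names are made up for the example. The `buggy` flag replaces the assignment y2 := y1 + 1 by y2 := y2 + 1, the faulty variant considered in the experiments.

```python
from collections import deque

def successors(state, buggy=False):
    """Yield the successor states of (pc1, y1, pc2, y2), one process step each."""
    pc1, y1, pc2, y2 = state
    if pc1 == "a1":
        yield ("a2", y2 + 1, pc2, y2)
    if pc1 == "a2" and (y2 == 0 or y1 <= y2):
        yield ("a3", y1, pc2, y2)
    if pc1 == "a3":
        yield ("a1", 0, pc2, y2)
    if pc2 == "b1":
        yield (pc1, y1, "b2", (y2 if buggy else y1) + 1)
    if pc2 == "b2" and (y1 == 0 or not (y1 <= y2)):
        yield (pc1, y1, "b3", y2)
    if pc2 == "b3":
        yield (pc1, y1, "b1", 0)

def shortest_violation(k, buggy=False, init=("a1", 0, "b1", 0)):
    """Breadth-first search up to depth k for a state with both processes critical.

    Returns the length of a shortest violating trace, or None if none exists
    within the bound.
    """
    frontier, seen = deque([(init, 0)]), {init}
    while frontier:
        state, depth = frontier.popleft()
        if state[0] == "a3" and state[2] == "b3":
            return depth
        if depth < k:
            for nxt in successors(state, buggy):
                if nxt not in seen:
                    seen.add(nxt)
                    frontier.append((nxt, depth + 1))
    return None
```

Under these assumptions the correct protocol shows no violation within the bound, while the faulty variant reaches a state with both processes in their critical sections.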


Table I. Offline lazy theorem proving ('-' is time > 1800 secs).

          refine-2              refine-3              refine-4
depth     time      conflicts   time      conflicts   time      conflicts
5         45.23     577         0.71      66          0.31      16
6         83.32     855         2.36      132         0.32      18
7         286.81    1405        12.03     340         1.75      58
8         627.90    1942        56.65     710         2.90      73
9         1321.57   2566        230.88    1297        8.00      105
10        -         -           985.12    2296        15.28     185
15        -         -           -         -           511.12    646


Table I includes some statistics for three different offline configurations
depending on which refine procedure described in Section 3 is used. For
each configuration, we list the total time (in seconds) and the number of
conflicts detected by the decision procedure. This table indicates that the
effort of detecting the relevant constraints and the linear explain function are
essential for efficiency. Recall that the experiments so far represent worst-case
scenarios in that the given formulas are unsatisfiable. For BMC problems
with counterexamples, however, our procedure usually converges much
faster. Consider, for example, the mutual exclusion problem of the Bakery
protocol with the assignment y2 := y2 + 1 instead of y2 := y1 + 1. The
corresponding counterexample for k = 7 is produced in a fraction of a second
after adding 53 lemmas.

  (a1, k1, b1, k2)          → (a1, k1, b2, 1 + k2)
→ (a2, 2 + k2, b2, 1 + k2)  → (a2, 2 + k2, b3, 1 + k2)
→ (a2, 2 + k2, b1, 0)       → (a3, 2 + k2, b1, 0)
→ (a3, 2 + k2, b2, 1)       → (a3, 2 + k2, b3, 1)

Notice that this counterexample represents a family of traces, since it is
parametrized by (newly introduced constants) k1 and k2 with k1, k2 ≥ 0.

Table II. Online lazy theorem proving.

          no explain                          explain
depth     time    conflicts   calls to ICS    time    conflicts   calls to ICS
5         0.03    24          162             0.01    7           71
6         0.08    48          348             0.01    7           83
7         0.19    96          744             0.02    7           94
8         0.98    420         3426            0.05    29          461
9         2.78    936         7936            0.19    70          1205
10        8.60    2008        17567           0.26    85          1543
15        -       -           -               4.07    530         13468


The results of using this online integration for the Bakery example can be
found in Table II for two different configurations.³ For each configuration,
we list the total time (in seconds), the number of conflicts detected by ICS,
and the total number of calls to ICS. Altogether, using an explanation facility
clearly pays off in that the number of refinement iterations (conflicts) is
reduced considerably.

³ The differences in the number of conflicts compared to Table I are due to the different
heuristics of the SAT solvers used.


5.  Related  Work

For the special case of equality theories over terms with uninterpreted function
symbols, Ackermann [1] already defined a reduction to Boolean logic by
adding propositional encodings of all relevant instances of the congruence
axiom. Variations of Ackermann's trick have been used, for example, by
Shostak [28] for arithmetic reasoning in the presence of uninterpreted function
symbols, and various reductions of the satisfiability problem of Boolean
formulas over the theory of equality with uninterpreted function symbols to
propositional SAT problems have recently been described by Goel, Sajid,
Zhou, and Aziz [13], by Pnueli, Rodeh, Shtrichman, and Siegel [23], and by
Bryant, German, and Velev [5]. In a similar vein, an eager reduction to propositional
logic for constraints in Pratt's difference logic has been described
by Strichman, Seshia, and Bryant [31]. Even for such a simple constraint
theory, however, an exponential number of constraints may be generated in
the preprocessing stage.

Compared with these eager reductions, our lazy integration procedure
works uniformly for logics with a rich set of data types. Moreover, instead
of constructing an equisatisfiable Boolean formula a priori, we compute a
sequence of refinements by adding propositional lemmas as obtained from
an analysis of spurious propositional assignments. In this way, the semantics
of constraints is introduced gradually and on demand, and only
inconsistency lemmas of relevance to the satisfiability of the formula are
added.

In research that is most closely related to ours, Barrett, Dill, and Stump [3]
describe an integration of Chaff with CVC by abstracting the Boolean constraint
formula to a propositional approximation, then incrementally refining
the approximation based on diagnosing conflicts using theorem proving,
and finally adding the appropriate conflict clause to the propositional
approximation. This integration corresponds directly to an online integration
in the lemmas on demand paradigm. Their approach to generating good explanations
is different from ours in that they extend CVC with a capability of
abstract proofs for over-approximating minimal sets of inconsistencies. Also,
optimizations based on don't cares are not considered explicitly in [3]. The
experimental results in [3] coincide with ours in that they suggest that lazy
theorem proving without explanations (there called the naive approach) and
offline integration quickly become impractical. Using equivalence checking
for pipelined microprocessors, speedups of several orders of magnitude over
their earlier SVC system are obtained.

Armando, Castellini, and Giunchiglia [2] propose a SAT-based approach
for the special case of solving disjunctions of Pratt's difference constraints. In
their experiments, they observe excessively redundant computations, which
can largely be eliminated using our explanation capabilities. A preprocessing
step for computing inconsistency clauses with two literals is used in [2] to simplify
problems. We also found it often advantageous to pregenerate
2- and even 3-inconsistencies to accelerate convergence. Optimizations based
on don't cares are not considered in [2].


6.  Conclusion 

The main contribution of this paper is a lazy integration of propositional SAT
solvers with constraint solvers for effectively deciding the satisfiability problem
for propositional constraint formulas. The key idea is to use constraint
solvers for suggesting, on demand, useful inconsistency lemmas. In this way,
only inconsistency lemmas of relevance to the satisfiability of the formula
are added. Various refinements such as online integration and acceleration of
convergence using explanation functions are needed to make the lemmas on
demand approach work effectively in practice.

Abstract.  Formal  analyses  can  provide  valuable  assurance  for  high  confidence
software  and  systems.  The  analyses  can  range  from  strong  typechecking  through 
test  case  generation  and  static  analysis  to  model  checking  and  full  verification.  In 
all  cases,  the  tools  that  support  the  analyses  use  formal  deduction  in  some  way  or 
other.  ICS  is  a  fully  automatic,  high-performance  decision  procedure  for  a  broad 
combination  of  theories  that  can  be  embedded  in  all  tools  of  this  kind  to  provide 
them  with  a  core  deductive  capability  of  exceptional  power  and  performance.  We 
describe  the  design  choices  underlying  ICS  and  the  capabilities  it  provides. 


1  Introduction 

Formal  deduction — that  is,  automated  theorem  proving — lies  at  the  heart  of  all  tools 
for  formal  analysis  of  software  and  system  descriptions.  In  formal  verification  systems 
such  as  PVS  [10],  the  deductive  capability  is  explicit  and  visible  to  the  user,  whereas  in 
tools  such  as  test  case  generators  it  is  hidden  and  often  ad-hoc.  We  believe  that  all  tools 
for  formal  analysis  would  benefit — both  in  performance  and  ease  of  construction — if 
they  could  draw  on  a  powerful  embedded  service  to  perform  common  deductive  tasks. 

Examples  of  the  tasks  that  can  be  required  are  those  that  ask  whether  one  formula 
is  a  consequence  of  others  (e.g.,  is  4  x  x  =  2  a  consequence  of  x  <  y,  x  <  1  —  y, 
and  2  x  x  >  1  when  the  variables  range  over  the  reals?),  and  those  that  ask  whether 
an  assignment  to  variables  can  be  found  that  satisfies  a  set  of  constraints  (e.g.,  find  an 
a  such  that  car(a)  =  cons(b,  c)).  The  first  task  is  a  decision  problem  that  might  arise 
in  verification,  the  second  is  a  constraint  satisfaction  problem  that  could  arise  in  test 
case  generation.  Notice  that  both  examples  involve  interpreted  theories:  rational  linear 
arithmetic  in  the  first,  and  lists  in  the  second. 
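
The first question can be settled by hand, which shows the kind of reasoning that a decision procedure for rational linear arithmetic automates. The sketch below assumes the premises are the non-strict inequalities $x \le y$, $x \le 1 - y$, and $2 \times x \ge 1$; that reading is an assumption about the intended relations.

```latex
\begin{align*}
x \le y \;\text{ and }\; x \le 1 - y
  &\;\Longrightarrow\; 2 \times x \le y + (1 - y) = 1, \\
2 \times x \le 1 \;\text{ and }\; 2 \times x \ge 1
  &\;\Longrightarrow\; 2 \times x = 1
   \;\Longrightarrow\; 4 \times x = 2 .
\end{align*}
```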

An  embedded  deductive  service  should  be  fully  automatic,  and  this  suggests  that  its 
focus  should  be  restricted  to  those  theories  whose  decision  and  satisfiability  problems 
are  decidable.  However,  there  are  some  contexts  that  can  tolerate  incompleteness  (e.g., 
in  extended  static  checking,  the  failure  to  prove  a  true  theorem  results  only  in  a  spurious 
warning  message),  and  others  where  speed  may  be  favored  over  completeness  (e.g.,  in 
construction of abstractions), so that undecidable theories (e.g., nonlinear integer arithmetic)
and those whose decision problems are often considered infeasible in practice
(e.g.,  real  closed  fields)  should  not  be  ruled  out  completely. 

Most  problems  that  arise  in  practice  involve  combinations  of  theories:  the  question 
whether 


$$f(\mathrm{cons}(4 \times \mathrm{car}(x) - 2 \times f(\mathrm{cdr}(x)),\ y)) = f(\mathrm{cons}(6 \times \mathrm{cdr}(x),\ y))$$

follows from $2 \times \mathrm{car}(x) - 3 \times \mathrm{cdr}(x) = f(\mathrm{cdr}(x))$, for example, requires simultaneously
the theories of uninterpreted functions, linear arithmetic, and lists. The ground
(i.e.,  quantifier-free)  fragment  of  many  combinations  is  decidable  when  the  full  (i.e., 
quantified)  combination  is  not,  and  practical  experience  indicates  that  automation  of 
the  ground  case  is  adequate  for  most  applications. 
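
The consequence claimed in this example can be checked in a few steps. Writing $a$ for car(x) and $d$ for cdr(x) (abbreviations introduced here only for readability), the hypothesis reads $2 \times a - 3 \times d = f(d)$:

```latex
\begin{align*}
4 \times a - 2 \times f(d)
  &= 4 \times a - 2 \times (2 \times a - 3 \times d)
   && \text{(hypothesis)} \\
  &= 6 \times d
   && \text{(linear arithmetic)} \\
\mathrm{cons}(4 \times a - 2 \times f(d),\, y)
  &= \mathrm{cons}(6 \times d,\, y)
   && \text{(congruence)} \\
f(\mathrm{cons}(4 \times a - 2 \times f(d),\, y))
  &= f(\mathrm{cons}(6 \times d,\, y))
   && \text{(congruence)}
\end{align*}
```

The first two steps need linear arithmetic, while the last two only replace equals by equals under a function symbol, which is why the theories have to cooperate.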

Practical  experience  also  suggests  several  other  desiderata  for  an  effective  deductive 
service.  Some  applications  (e.g.,  construction  of  abstractions)  invoke  their  deductive 
service a huge number of times in the course of a single calculation, so that performance
of the service must be very good (e.g., tens or hundreds of thousands of invocations
per second). Other applications (e.g., proof search) explore many variations on a
formula  (i.e.,  alternately  asserting  and  denying  various  combinations  of  its  premises), 
so  the  deductive  service  should  not  examine  individual  formulas  in  isolation,  but  should 
provide a rich API that supports incremental assertion, retraction, and querying of formulas.
Other applications (e.g., test case generation) generate propositionally complex
formulas  (i.e.,  formulas  with  thousands  or  millions  of  propositional  connectives  applied 
to  terms  over  the  decided  theories),  so  that  this  type  of  proof  search  must  be  performed 
efficiently  inside  the  deductive  service. 

We have developed a system called ICS (the name stands for Integrated Canonizer/Solver)
that can be embedded in applications to provide deductive services satisfying
the desiderata above. In the following sections, we outline the design choices
embodied  in  ICS,  its  capabilities  and  method  of  operation,  and  describe  some  of  its 
applications. 

2  Core  ICS 

The core of ICS is a decision procedure for a combination of ground theories including
equality with function symbols, integer and rational linear arithmetic, fixed-length
bitvectors, arrays, tuples, and coproducts (the combination of the last two provides abstract
datatypes such as lists and binary trees). Apart from bitvectors, this capability is
similar  to  that  of  the  decision  procedures  in  PVS  (e.g.,  the  assert  command),  but  ICS 
can  handle  much  larger  formulas. 

It  is  crucial  to  its  utility  that  ICS  is  able  to  decide  a  combination  of  theories.  It  is 
desirable  to  achieve  this  by  combining  decision  procedures  for  its  individual  theories  in 
a  modular  fashion.  However,  there  is  a  tradeoff  between  modularity  and  performance. 
The combination method of Nelson and Oppen [9], for example, imposes few restrictions
on its component theories and their decision procedures, but yields relatively low
performance.  This  is  because  the  separate  decision  procedures  do  not  share  much  state 
and  communicate  only  by  propagating  newly  discovered  equalities  back  and  forth.  The 
combination method of Shostak [14], on the other hand, requires that its component
theories are canonizable and solvable, and achieves high performance by tightly integrating
these components through an efficient data structure for congruence closure.
Most theories of practical interest are canonizable and solvable, so ICS uses a corrected
version of Shostak’s method. Theories that do not satisfy the requirements for Shostak’s
method can be integrated using Nelson and Oppen’s method above the Shostak combination.

As  mentioned,  an  efficient  data  structure  and  procedure  for  congruence  closure  lies 
at  the  heart  of  ICS.  This  provides  a  decision  procedure  for  the  theory  of  equality  with 
uninterpreted  function  symbols,  and  is  used  to  integrate  decision  procedures  for  other 
canonizable  and  solvable  theories.  Early  treatments  of  this  integration  were  incorrect 
and  could  yield  incomplete  or  nonterminating  procedures.  The  first  correct  treatment 
for  the  integration  of  congruence  closure  with  one  other  theory  was  developed  by 
Shankar and Rueß [12]; this construction has been formally verified in PVS by Ford and
Shankar [6]. The extension to multiple theories is not straightforward because, although
the  combination  of  the  canonizers  for  the  constituent  theories  yields  a  canonizer  for  the 
combined  theory  (which  is  an  independently  useful  artifact),  the  combination  of  the 
solvers  may  not  (contrary  to  previous  belief)  be  a  solver  for  the  combination.  The  first 
correct extension to multiple theories was also developed by Shankar and Rueß [13].

A  decision  procedure  (i.e.,  canonizer  and  solver)  for  rational  linear  arithmetic  is 
quite  straightforward  and  efficient,  but  integer  linear  arithmetic  is  more  challenging 
because  it  can  require  case-splitting  (i.e.,  search)  to  determine  whether  some  property 
is  satisfied  by  an  integer  in  a  certain  range  (hence,  the  problem  is  NP-complete).  There 
are  straightforward  methods  for  this  problem  that  are  easily  shown  to  be  complete  (e.g., 
the  method  of  Fourier-Motzkin),  but  they  are  inefficient  on  cases  that  commonly  arise 
in practice (e.g., constraints of the form $x - y < c$, where x, y are variables and c
is  an  integer  constant).  ICS  uses  a  new  method  that  is  efficient  on  the  common  cases, 
complete,  and  smoothly  extensible  to  richer  fragments  such  as  nonlinear  arithmetic. 
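
To make the terms "canonizer" and "solver" concrete, here is a minimal sketch for rational linear arithmetic. It is not the ICS implementation: the term representation (lists of variable/coefficient pairs), the pseudo-variable `ONE`, and the function names `canon` and `solve` are all assumptions made for illustration.

```python
from fractions import Fraction

ONE = "1"  # hypothetical pseudo-variable carrying the constant part of a term


def canon(term):
    """Return the canonical form of a linear term.

    A term is an iterable of (variable, coefficient) pairs. The result is a
    sorted tuple with like variables merged and zero coefficients dropped,
    so two terms are equal in the theory iff their canonical forms are equal.
    """
    sums = {}
    for var, coeff in term:
        sums[var] = sums.get(var, Fraction(0)) + Fraction(coeff)
    return tuple(sorted((v, c) for v, c in sums.items() if c != 0))


def solve(lhs, rhs):
    """Solve lhs = rhs over the rationals.

    Returns True if the equation is trivially valid, False if it is
    unsatisfiable, and otherwise a pair (var, term) meaning var = term,
    where term is canonical and does not mention var.
    """
    # Move everything to one side: lhs - rhs = 0.
    diff = dict(canon(list(lhs) + [(v, -Fraction(c)) for v, c in rhs]))
    variables = sorted(v for v in diff if v != ONE)
    if not variables:
        # Only a constant is left: 0 = 0 is valid, c = 0 with c != 0 is not.
        return ONE not in diff
    var = variables[0]
    k = diff.pop(var)
    # k * var + rest = 0, hence var = -rest / k.
    return var, canon((v, -c / k) for v, c in diff.items())


# 2x + 3y - x and y + x + 2y have the same canonical form.
same = canon([("x", 2), ("y", 3), ("x", -1)]) == canon([("y", 1), ("x", 1), ("y", 2)])
# 2x + 1 = y + 5 is solved for x.
solved = solve([("x", 2), (ONE, 1)], [("y", 1), (ONE, 5)])
print(same, solved)
```

Equality of two terms reduces to syntactic equality of their canonical forms, and a solved equation can be used as a substitution. Nothing here addresses the integer case, which the paragraph above notes may require case-splitting.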

Verification  and  model  checking  for  hardware  generally  involve  reasoning  over 
bitvectors.  It  is,  of  course,  possible  to  treat  each  bit  as  a  Boolean  variable  and  then 
use  an  efficient  decision  procedure  for  the  Booleans,  but  this  immediately  invites  an 
exponential  case  explosion.  A  better  method  is  to  split  the  bitvectors  into  chunks  (not 
individual  bits)  and  to  do  so  only  when  necessary.  ICS  uses  a  method  of  this  kind  for 
fixed-length  bitvectors  [2,7]  and  integrates  it  with  integer  arithmetic  for  their  numerical 
(e.g.,  unsigned  and  twos-complement)  interpretations. 

In addition to the theories described above, ICS also decides the theories of arrays,
tuples, and coproducts; the combination of the latter two can represent abstract
datatypes  such  as  lists  and  binary  trees. 

Core  ICS  operates  as  a  decision  procedure:  it  reports  whether  the  formula  under 
consideration  is  valid — which  is  equivalent  to  its  negation  being  unsatisfiable.  In  the 
case  that  a  formula  is  satisfiable,  the  ICS  data  structures  contain  sufficient  information 
to  extract  a  satisfying  assignment — although  this  is  not  yet  implemented. 
3 ICS with SAT


Core  ICS  operates  on  formulas  that  are  conjunctions  of  terms  in  the  combination  of  its 
theories.  However,  many  applications  generate  proof  obligations  or  constraints  that  have 
richer  propositional  structure.  For  example,  a  test  case  of  length  2  for  a  shift  register  may 
reduce  to  satisfiability  of  the  following  formula. 

$$\begin{aligned}
&(x_1 = x_0[1 : n-1] \mathbin{+\!+} 1_1) \land (x_2 = x_1[1 : n-1] \mathbin{+\!+} 1_1) \;\land \\
&(x_0 \neq 0_n \lor x_1 \neq 0_n \lor x_2 \neq 0_n) \land (x_0 = x_2 \lor x_1 = x_2),
\end{aligned}$$

where $x[l : r]$ denotes extraction of bits l through r of the bitvector x of length n,
$\mathbin{+\!+}$ denotes bitvector concatenation, and $1_r$ (resp. $0_r$) denotes the bitvector of length r
whose bits are all 1 (resp. 0).
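
For a small n, the shift-register formula can be explored by brute force, which makes clear what a satisfying assignment looks like. The sketch below rests on several assumptions: bitvectors are tuples with bit 0 leftmost, the last conjunct is read as $x_0 = x_2 \lor x_1 = x_2$, and the helper names are hypothetical. It enumerates values rather than splitting bitvectors into chunks as ICS does.

```python
from itertools import product


def extract(x, lo, hi):
    """Bits lo..hi (inclusive) of bitvector x, a tuple of 0/1 with bit 0 leftmost."""
    return x[lo:hi + 1]


def shift_in_one(x):
    """x[1 : n-1] ++ 1_1: drop bit 0 and append a single 1 bit."""
    n = len(x)
    return extract(x, 1, n - 1) + (1,)


def shift_register_tests(n):
    """All x0 of length n for which the length-2 test-case formula holds."""
    zero = (0,) * n
    witnesses = []
    for x0 in product((0, 1), repeat=n):
        x1 = shift_in_one(x0)  # first conjunct fixes x1
        x2 = shift_in_one(x1)  # second conjunct fixes x2
        some_nonzero = x0 != zero or x1 != zero or x2 != zero
        repeats = x0 == x2 or x1 == x2
        if some_nonzero and repeats:
            witnesses.append(x0)
    return witnesses


print(shift_register_tests(3))
```

Enumeration is exponential in n, which is the cost that the combination of a SAT solver with a bitvector decision procedure is meant to avoid.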

The  disjunctions  in  formulas  such  as  this  necessitate  search  and  the  challenge  is  to 
integrate  this  capability  with  core  ICS.  The  PVS  ground  command  provides  modest 
functionality  of  this  type  with  the  assistance  of  an  external  BDD  package.  The  problem 
with  this  approach  is  that  the  BDD  represents  all  possible  satisfying  assignments  (and 
is  therefore  expensive  to  construct),  whereas  we  would  be  satisfied  with  just  one  (or 
the  knowledge  that  there  are  none).  Propositional  satisfiability  solvers  (SAT  solvers) 
provide this more targeted type of search and recent advances have made them extraordinarily
fast for many problems that arise in practice: often they are able to discharge
formulas with hundreds of thousands of variables and millions of terms in seconds or a
few minutes [8].

To  connect  core  ICS  to  a  SAT  solver,  we  use  variable  abstraction:  each  interpreted 
term (e.g., $x_1 = x_0[1 : n-1] \mathbin{+\!+} 1_1$) is replaced by a distinct propositional variable
(e.g.,  p)  and  the  SAT  solver  is  asked  to  solve  the  resulting  propositional  system.  The 
truth  values  assigned  to  the  propositional  variables  by  the  SAT  solver  are  then  extended 
to  their  original  interpretations  and  the  core  ICS  decision  procedure  checks  them  for 
consistency.  If  the  interpretations  are  consistent,  then  we  are  done;  if  not,  the  root  of  the 
inconsistency  can  be  generated  and  passed  to  the  SAT  solver  as  an  additional  constraint 
(we call this the generation of “lemmas on demand” [3]). For example, if p represents
the term $x = y$, q represents $f(x) = f(y)$, and the SAT solver returns $p$, $\lnot q$, then core
ICS will detect the inconsistency in the interpretation $x = y \land f(x) \neq f(y)$ and can
generate the lemma $\lnot p \lor q$ as a new constraint for the SAT solver. Proceeding back and
forth in this way, the SAT solver generates new candidate assignments and the decision
procedure generates additional constraints until either we find an assignment
whose  interpretation  is  satisfiable,  or  the  set  of  constraints  becomes  unsatisfiable.  The 
effectiveness  of  this  approach  depends  on  how  rapidly  the  search  space  is  cut  down  at 
each  stage  by  the  new  constraints  generated  by  the  decision  procedure.  The  most  potent 
constraints  would  be  the  true  “root  causes”  of  the  inconsistencies  detected  at  each  stage 
but it can take a long time to calculate such precise constraints and this negates the savings
due to the smaller search space. Good overall performance is obtained using fast
heuristics that generate an approximate “explanation” for the root cause of each inconsistency
[3]. We are still tuning our heuristics in search of the best overall performance.
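
The back-and-forth between the two components can be sketched in a few lines. This is an illustration under stated assumptions, not the ICS algorithm: exhaustive enumeration stands in for the SAT solver, a naive congruence closure stands in for the decision procedure, the "explanation" is simply the negation of the whole candidate assignment, and all function names are hypothetical.

```python
from itertools import product


def subterms(t):
    """Yield t and all its subterms; applications are tuples (fn, arg, ...)."""
    yield t
    if isinstance(t, tuple):
        for arg in t[1:]:
            yield from subterms(arg)


def theory_consistent(atoms, model):
    """Check an assignment to equality atoms using naive congruence closure."""
    terms = set()
    for lhs, rhs in atoms.values():
        terms.update(subterms(lhs))
        terms.update(subterms(rhs))
    parent = {t: t for t in terms}

    def find(t):
        while parent[t] != t:
            t = parent[t]
        return t

    for name, value in model.items():
        if value:  # atom assigned true: merge both sides
            lhs, rhs = atoms[name]
            parent[find(lhs)] = find(rhs)
    apps = [t for t in terms if isinstance(t, tuple)]
    changed = True
    while changed:  # propagate congruence until a fixpoint is reached
        changed = False
        for a in apps:
            for b in apps:
                if (find(a) != find(b) and a[0] == b[0] and len(a) == len(b)
                        and all(find(u) == find(v) for u, v in zip(a[1:], b[1:]))):
                    parent[find(a)] = find(b)
                    changed = True
    # Atoms assigned false must have their two sides in different classes.
    return all(find(atoms[n][0]) != find(atoms[n][1])
               for n, value in model.items() if not value)


def propositional_model(names, clauses):
    """Brute-force stand-in for a SAT solver over CNF clauses."""
    for values in product((False, True), repeat=len(names)):
        model = dict(zip(names, values))
        if all(any(model[n] == pol for n, pol in clause) for clause in clauses):
            return model
    return None


def lazy_solve(atoms, clauses):
    """Alternate propositional search and theory checks, adding lemmas on demand."""
    clauses, lemmas = list(clauses), []
    while True:
        model = propositional_model(sorted(atoms), clauses)
        if model is None:
            return None, lemmas
        if theory_consistent(atoms, model):
            return model, lemmas
        lemma = [(n, not v) for n, v in sorted(model.items())]  # block this assignment
        clauses.append(lemma)
        lemmas.append(lemma)


ATOMS = {"p": ("x", "y"), "q": (("f", "x"), ("f", "y"))}
print(lazy_solve(ATOMS, [[("p", True)]]))
```

With the single clause p, the first candidate sets p true and q false, the theory check rejects it, and the generated lemma is the clause $\lnot p \lor q$ from the example above. Blocking whole assignments is the least informative explanation possible; the point of the heuristics discussed in the text is to return a much smaller clause.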

Full  ICS  integrates  the  combined  decision  procedure  of  core  ICS  with  a  SAT  solver 
in the manner described. We do not use an off-the-shelf SAT solver because the back-and-forth
interaction with the decision procedure imposes novel requirements (e.g., we
want to process new constraints incrementally from the current state, not restart from
the beginning, and we also use “don’t care” assignments), but we do employ many
of the techniques that make such solvers fast [15]. Our experiments indicate that the
integrated SAT solver in ICS yields several orders of magnitude improvement over a
looser combination using an off-the-shelf SAT solver. Used purely as a SAT solver, the
performance of full ICS is comparable to Chaff [8].

Like  core  ICS,  full  ICS  operates  as  a  decision  procedure,  but  we  plan  to  extend  it  to 
a  satisfiability  procedure  in  the  near  future. 

4  Using  ICS 

Core ICS is implemented in Objective Caml, and its SAT solver in C++; the full system
functions as a C library and can be called from virtually any language. We have
experience using it from C, C++, Lisp, Scheme, and Objective Caml. The system was
developed under Linux but has been ported to Mac OS X and to Windows XP (under
Cygwin), and we anticipate little difficulty in porting it to other systems.

In  addition  to  its  C  interface,  ICS  is  provided  with  a  simple  text-based  interactor 
that  can  be  used  for  experimenting  with  its  capabilities.  ICS  maintains  a  state  that  can 
be  manipulated  and  queried  by  a  series  of  commands.  Most  importantly,  the  assert 
command extends the current state with a new fact. The following command, for example,
adds an equality over terms built from the variable x, the uninterpreted function
symbol f, the operators of linear arithmetic, and S-expressions built from the pairing
function cons(., .) and its first and second projections car(.) and cdr(.).

    ics> reset.
    :ok
    ics> assert 2 * car(x) - 3 * cdr(x) = f(cdr(x)).
    :ok


We  can  now  assert  a  second  equality,  and  the  response  valid  indicates  that  this  is 
deduced  to  be  a  consequence  of  the  previously  asserted  facts. 

    ics> assert f(cons(4 * car(x) - 2 * f(cdr(x)), y))
               = f(cons(6 * cdr(x), y)).
    :valid


The  command  sat  invokes  the  SAT  solver  (here  |  denotes  disjunction  and  &  is 
conjunction). 

    ics> sat (x = 1 | x = 2 | x = 3) & x > 1.
    :sat(s5) [-1 + x > 0; x = 3]

The response from ICS indicates that all assignments to x satisfying both -1 + x > 0
and x = 3 describe models for the input formula (the annotation s5 simply
names this logical state). There is obviously only one possible assignment here, so the
description is not minimal. Construction of concrete satisfying assignments is planned
for the near future.


5  Applications  of  ICS 

ICS  can  be  used  to  provide  embedded  deductive  support  for  existing  applications,  but 
its speed and power also make new applications possible. We describe representative
applications of each kind.

5.1  Discharging  Proof  Obligations 

ICS  can  be  used  to  augment  or  replace  existing  deductive  capabilities  in  systems  that 
generate  and  discharge  proof  obligations. 

For  example,  ICS  can  be  used  in  place  of  the  standard  decision  procedures  in  PVS. 
Because  the  standard  decision  procedures  have  different  capabilities  than  ICS,  a  PVS 
proof  script  developed  using  the  former  will  generally  require  adjustment  to  work  with 
the  latter.  For  testing  and  benchmarking  purposes,  we  have  run  PVS  in  a  mode  where 
proof  scripts  are  guided  by  the  standard  decision  procedures,  but  ICS  is  run  in  parallel 
and  its  behavior  compared  with  the  standard  procedures.  Differences  were  examined 
to ensure they were intended. We used proofs of the 750 theorems in the PVS prelude
(built-in library) as our test bench. Despite its more costly interface (PVS and its
standard decision procedures are implemented in Lisp, from which ICS is invoked as a
foreign function through its C interface) and the fact that PVS uses only its core capabilities,
ICS is substantially faster on examples that really exercise the decision procedures
(for  small  examples,  any  differences  are  swamped  by  the  overhead  of  other  processing 
in  the  PVS  prover).  Future  versions  of  PVS  will  make  fuller  use  of  ICS  capabilities.  We 
anticipate  that  this  will  be  beneficial  both  to  users  of  PVS  and  to  those  who  intend  to  use 
ICS  directly  but  wish  to  use  PVS  to  explore  and  prototype  the  deductive  “glue”  needed 
to  reduce  their  application  to  the  capabilities  provided  by  ICS.  Such  glue  is  likely  to 
involve  Skolemization  (and  possibly  quantifier  instantiation),  and  definition  expansion 
(and  possibly  rewriting). 

We are currently optimizing the capabilities of ICS to support the deductive requirements
of the Destiny verification system under development at NSA.

5.2  Bounded  Model  Checking  and  Test  Case  Generation 

Bounded  model  checking  (BMC)  has  become  a  popular  debugging  and  assurance  method 
for hardware designs [1]. Bounded model checking asks whether there is a counterexample
of length k or less to a given property P (typically an invariant, but the method
works  for  full  linear  temporal  logic)  of  a  design  represented  as  an  initiality  predicate 
I  and  transition  relation  T.  For  hardware  designs  at  the  register  transfer  level,  P,  I, 
and  T  are  represented  directly  in  propositional  calculus  and  the  BMC  problem  then 
reduces  to  a  (typically,  huge)  SAT  problem.  The  performance  of  modern  SAT  solvers 
allows  BMC  to  find  deeper  bugs  on  bigger  designs  than  a  standard  BDD-based  symbolic 
model checker. More importantly, BMC requires less tinkering (e.g., variable ordering,
downscaling)  by  the  user  than  standard  model  checking.  Typically,  the  process  is  to  try 
k  =  1,  then  k  =  2, 3, . . .  until  either  a  counterexample  is  found,  or  the  resources  of  the 
computer — or  the  patience  of  the  user — are  exhausted. 

Full  ICS  immediately  allows  BMC  to  be  extended  from  hardware  designs  consisting 
of  purely  Boolean  circuits  to  software  and  system  designs  (and  hardware  designs  at 
higher  levels  of  description)  whose  state  is  defined  over  integers,  arrays,  bitvectors,  and 
datatypes,  and  their  corresponding  operations — in  short,  over  any  combination  of  the 
theories  decided  by  ICS.  We  call  this  “Infinite  BMC”  since  the  state  space  is  potentially 
infinite [5].

Given a system specified by initiality predicate I and transition relation T, there is
a counterexample of length k to invariant P if there is a sequence of states $s_0, \ldots, s_k$
such that

$$I(s_0) \land T(s_0, s_1) \land T(s_1, s_2) \land \cdots \land T(s_{k-1}, s_k) \land \lnot P(s_k).$$

The Infinite BMC problem is simply to find a satisfying assignment for $s_0, \ldots, s_k$ in
this formula, which is exactly the capability of ICS.¹
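
The meaning of the BMC formula can be illustrated with a small explicit search over an integer-valued state. This is only a sketch: enumeration of traces stands in for the solver, the successor function `step` plays the role of T, and the example system and all names are hypothetical.

```python
def bmc(init_states, step, prop, k):
    """Search for s_0, ..., s_j with j <= k, I(s_0), T(s_i, s_{i+1}), not P(s_j).

    `step(s)` returns the successors of s, so T(s, t) holds iff t is in
    step(s). Returns a shortest counterexample trace as a list, or None
    if no counterexample of length k or less exists.
    """
    frontier = [[s] for s in init_states]
    for _ in range(k + 1):
        for trace in frontier:
            if not prop(trace[-1]):
                return trace
        # Unroll the transition relation by one more step.
        frontier = [trace + [t] for trace in frontier for t in step(trace[-1])]
    return None


def step(x):
    """Hypothetical transition relation over the integers: increment or double."""
    return [x + 1, 2 * x]


def prop(x):
    """Hypothetical invariant to be refuted: the state never equals 6."""
    return x != 6


print(bmc([0], step, prop, 3))
print(bmc([0], step, prop, 4))
```

The state here ranges over all integers, so the system is infinite-state even though each bounded query inspects only finitely many traces. A solver-based procedure would hand the unrolled formula to the decision procedure instead of enumerating traces.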

Using  correct  designs  supplied  for  evaluation  purposes  by  an  industrial  collaborator 
(they  are  hardware  designs,  but  we  do  not  know  their  origins  or  purpose),  we  performed 
Infinite  BMC  for  increasing  k  until  the  time  taken  by  ICS  approached  30  minutes  (on  a 
2 GHz Pentium IV with 1 GB of memory). At this point, one of the BMC formulas had
227,108 terms and its representation as a text file occupied 5 MB, another had 105,844
terms and a 3 MB text file, while a third had 72,291 terms and a 2 MB text file. In all cases,
ICS required less than 80 MB of memory. Observe that these are worst-case examples:
the  designs  are  correct  (for  the  invariants  concerned)  and  hence  the  BMC  formulas  have 
no  satisfying  assignments  and  the  full  search  space  must  be  explored.  Other  invariants 
do manifest bugs in the second of the designs mentioned above, and ICS found a counterexample
to one of them of length 4, and a counterexample to another of length 6,
both  in  under  a  minute. 

Structural  test  coverage  criteria,  including  the  MC/DC  criterion  required  for  flight 
control software, can be specified as formulas in temporal logic [11]. Counterexamples
to the negation of these formulas then constitute suitable test cases. Experiments
with  symbolic  model  checkers  have  shown  that  they  can  be  used  within  this  framework 
as  very  effective  test  case  generators.  Bounded  model  checkers  should  be  even  more 
effective  (since  they  are  specialized  to  the  efficient  construction  of  counterexamples). 
However, these strictly Boolean and propositional methods apply only to Boolean abstractions
of software designs specified over arithmetic variables and data structures
and can therefore generate infeasible test cases. Infinite BMC using ICS can be applied
directly to software designs, thereby eliminating infeasible test cases and achieving accurate
coverage.

1  As  noted  earlier,  ICS  currently  operates  as  a  decision  procedure:  it  can  indicate  whether  a 
formula  is  valid  or,  equivalently,  whether  its  negation  is  unsatisfiable.  In  the  case  that  the 
negation of a formula is satisfiable, ICS does not yet produce a satisfying assignment (i.e., a
concrete  counterexample  to  the  original  formula).  However,  the  Infinite  BMC  procedure  does 
extract  “symbolic  counterexamples”  from  information  in  the  ICS  data  structures. 
5.3 k-Induction


If  BMC  finds  a  counterexample  of  length  k,  then  we  have  found  a  bug,  and  are  done. 
But  if  we  fail  to  find  a  counterexample  for  any  k  up  to  some  limit  on  our  resources  or 
patience,  we  cannot  conclude  that  we  have  verified  the  design — for  there  could  always 
be a counterexample of length longer than any that we tried.² To verify the design (for
safety  property  P),  we  must  perform  some  kind  of  inductive  argument  that  applies  to 
traces  of  all  lengths.  The  usual  way  to  do  this  by  theorem  proving  is  to  establish  that 
the property concerned is inductive: that is, it is true of all initial states
(i.e., $I(s) \supset P(s)$) and if it is true of some state, then it is true of all its successors
(i.e., $P(s) \land T(s, t) \supset P(t)$). The weakness of this method is that the second condition may be
violated  by  a  state  s  that  is  unreachable  from  an  initial  state.  We  must  then  replace  P 
by  a  stronger  property  that  excludes  the  troublesome  state  s  and  repeat  the  process.  It  is 
not  uncommon  to  have  to  iterate  this  process  many  tens  of  times.  Strengthening  often 
requires  human  insight,  though  a  good  heuristic  is  often  to  conjoin  to  P  a  formula  that 
asserts  that  s  is  unreachable. 

A stronger form of induction requires that only when we have a sequence of k states
satisfying P must all the successors also satisfy P. This is called k-induction, and it
combines well with BMC: we first perform BMC of depth k and if that fails to refute the
formula, we try k-induction (the formulas generated are very similar to those for BMC),
and if that fails, we repeat the process for k + 1 ((k + 1)-induction is stronger than
k-induction in that it proves more formulas). Subject to certain side conditions (for example, the
initial k-sequence should be acyclic), k-induction is a complete method for finite-state
systems. These results generalize from the finite- to infinite-state case when ICS is
substituted for a SAT solver, and the method becomes complete for important classes
of infinite-state systems, such as timed automata [4].
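
A small finite-state example shows why larger k can succeed where ordinary induction fails. The sketch below checks the base and step conditions by explicit set computations rather than with a solver, omits the side conditions (such as acyclicity) mentioned above, and uses a hypothetical four-state system and hypothetical names.

```python
def successors(trans, s):
    """All t with T(s, t), for a transition relation given as a set of pairs."""
    return {t for (u, t) in trans if u == s}


def k_inductive(states, init, trans, prop, k):
    """Check whether prop is k-inductive for the given finite system (k >= 1)."""
    # Base case: prop holds in every state reachable in fewer than k steps.
    frontier = set(init)
    for _ in range(k):
        if not all(prop(s) for s in frontier):
            return False
        frontier = {t for s in frontier for t in successors(trans, s)}
    # Step case: collect the possible last states of any sequence of k
    # consecutive states that all satisfy prop ...
    ends = {s for s in states if prop(s)}
    for _ in range(k - 1):
        ends = {t for s in ends for t in successors(trans, s) if prop(t)}
    # ... and require that every successor of such a state satisfies prop.
    return all(prop(t) for s in ends for t in successors(trans, s))


STATES = {0, 1, 2, 3}
INIT = {0}
TRANS = {(0, 1), (1, 0), (2, 3), (3, 3)}  # state 2 is unreachable and has no predecessor


def safe(s):
    """Hypothetical safety property: the bad state 3 is never entered."""
    return s != 3


print(k_inductive(STATES, INIT, TRANS, safe, 1))
print(k_inductive(STATES, INIT, TRANS, safe, 2))
```

The property fails 1-induction because the unreachable state 2 satisfies it but steps to 3. It passes 2-induction because no state satisfying the property leads into 2, so no sequence of two good states ends there, and no invariant strengthening is needed.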

Our Infinite BMC procedure built on ICS has been extended to perform k-induction
(with additional optimizations, e.g., requiring that only the first state in a sequence may
be an initial state) and to strengthen invariants (using the heuristic described earlier).
Standard examples such as the abstracted Futurebus and Illinois cache coherence protocols
are verified in seconds by this method, and standard timed automata examples
such  as  the  Fischer  protocol  and  train  gate  controller  are  verified  in  fractions  of  a  second. 
These  results  suggest  that  ICS  can  be  competitive  with  specialized  systems  operating  in 
their  own  domains. 

6  Conclusion 

ICS  packages  a  powerful  and  efficient  set  of  deductive  capabilities  in  the  form  of  a  C 
library  that  can  easily  be  accessed  by  other  applications.  This  makes  deduction  available 
as  an  embedded  capability,  whereas  previously  it  was  available  only  through  theorem 
provers  intended  for  standalone  operation. 

2  For  some  examples,  it  is  possible  to  compute  a  completeness  threshold,  such  that  failure  to 
find  a  counterexample  shorter  than  the  threshold  is  sufficient  for  verification.  However,  for 
most  examples  in  practice,  it  is  either  too  expensive  to  compute  the  threshold,  or  its  value  is 
beyond  the  reach  of  BMC. 


Powerful  embedded  deduction  will  allow  many  conventional  tools  to  provide  new 
capabilities,  or  more  potent  forms  of  existing  capabilities,  at  little  cost.  For  example,  a 
compiler  can  perform  truly  accurate  common  subexpression  detection  by  asserting  the 
path  predicates  to  ICS,  then  using  its  canonizer  to  compare  subexpressions. 

Simple formal analysis tools (e.g., completeness and consistency checkers for tabular
specifications, test case generators, and bounded model checkers) can obtain most of
their  deductive  support  from  ICS,  with  little  deductive  “glue”  needed  in  the  application. 

We  plan  to  enlarge  the  services  provided  by  ICS  so  that  even  less  deductive  glue 
will be required in future. In particular, we intend to add quantifier elimination, rewriting
(which will also perform definition expansion), and forward chaining (which is
very effective for transitive relations). The quantified form of the combination of theories
used in ICS is not decidable (e.g., quantified integer linear arithmetic, that is, Presburger
Arithmetic, becomes undecidable when uninterpreted function symbols are added),
but the circumstances that trigger undecidability are sharply defined (and rare in practice)
so that it is possible to decide a very large and useful fragment of the full theory.
We  expect  that  our  methods  will  be  heuristically  effective  on  the  undecidable  fragment 
also,  and  on  other  undecidable  extensions  (e.g.,  nonlinear  integer  arithmetic). 

Other planned enhancements include generation of concrete solutions to satisfiability
problems (and hence concrete counterexamples to BMC problems), and generation
of  proof  objects  (independently  checkable  explanations  for  the  decisions  made  by  ICS). 
We  expect  that  the  latter  will  also  improve  the  interaction  between  core  ICS  and  its  SAT 
solver,  and  thereby  further  increase  the  performance  of  full  ICS. 

ICS  focuses  on  providing  full  automation  for  the  cases  where  that  is  effective;  we 
do not intend to extend ICS to a general theorem prover. However, just as our original
decision procedures made it possible for PVS (and its NSA-sponsored predecessor
Ehdm) to have a different architecture and style of interaction than previous interactive
theorem provers [10], so the increased capability of ICS will allow future systems
to support new and more productive styles of human interaction. We intend to explore
these opportunities in our research with future versions of PVS, and to assist NSA to do
the  same  with  its  own  systems. 

ICS  is  freely  available  for  noncommercial  research  purposes  under  license  to  SRI. 
Please visit its home page at ics.csl.sri.com.
Abstract. We explore the combination of bounded model checking and induction
for proving safety properties of infinite-state systems. In particular, we define
a general k-induction scheme and prove completeness thereof. A main characteristic
of our methodology is that strengthened invariants are generated from
failed k-induction proofs. This strengthening step requires quantifier-elimination,
and we propose a lazy quantifier-elimination procedure, which delays expensive
computations of disjunctive normal forms when possible. The effectiveness
of  induction  based  on  bounded  model  checking  and  invariant  strengthening  is 
demonstrated  using  infinite-state  systems  ranging  from  communication  protocols 
to  timed  automata  and  (linear)  hybrid  automata. 


1  Introduction 

Bounded model checking (BMC) [5,4,7] is often used for refutation, where one systematically
searches for counterexamples whose length is bounded by some integer
k. The bound k is increased until a bug is found, or some pre-computed completeness
threshold is reached. Unfortunately, the computation of completeness thresholds is usually
prohibitively expensive and these thresholds may be too large to effectively explore
the associated bounded search space. In addition, such completeness thresholds do not
exist for many infinite-state systems.

In deductive approaches to verification, the invariance rule is used for establishing
invariance properties $\varphi$ [11, 10, 13, 3]. This rule requires a property $\psi$ which is stronger
than $\varphi$ and inductive in the sense that all initial states satisfy $\psi$, and $\psi$ is preserved
under each transition. Theoretically, the invariance rule is adequate for verifying a valid
property of a system, but its application usually requires creativity in coming up with
a sufficiently strong inductive invariant. It is also nontrivial to detect bugs from failed
induction  proofs. 

In this paper, we explore the combination of BMC and induction based on the
$k$-induction rule. This induction rule generalizes BMC in that it requires demonstrating
the invariance of $\varphi$ in the first $k$ states of any execution. Consequently, error
traces of length $k$ are detected. This induction rule also generalizes the usual invariance
rule in that it requires showing that if $\varphi$ holds in every state of every execution of
length $k$, then every successor state also satisfies $\varphi$. In its pure form, however,
$k$-induction does not require the invention of a strengthened inductive invariant. As in
BMC, the bound $k$ is increased until either a violation is detected in the first $k$ states
of an execution or the property at hand is shown to be $k$-inductive. In the ideal case of
attempting to prove correctness of an inductive property, 1-induction suffices and iteration
up to a possibly large completeness threshold, as in BMC, is avoided. The $k$-induction rule
is sound, but further conditions, such as the restriction to acyclic execution sequences,
must be added to make $k$-induction complete even for finite-state systems [17].

F33615-01-C-1908, and NASA Contract B09060051.

** Also affiliated with University of Ulm, Germany.

One of our main contributions is the definition of a general $k$-induction rule and a
corresponding completeness result. This induction rule is parameterized with respect
to suitable notions of simulation. These simulation relations induce different notions of
path compression in that an execution path is compressed if it does not contain two
similar states. Many completeness results, such as $k$-induction for timed automata, follow
by simply instantiating this general result with the simulation relation at hand. For
general transition systems, we develop an anytime algorithm for approximating adequate
simulation relations for $k$-induction.

Whenever $k$-induction fails to prove a property $\varphi$, there is a counterexample of
length $k + 1$ such that the first $k$ states satisfy $\varphi$ and the last state does not
satisfy $\varphi$. If the first state of this trace is reachable, then $\varphi$ is refuted.
Otherwise, the counterexample is labeled spurious. By assuming the first state of this trace
is unreachable, a spurious counterexample is used to automatically obtain a strengthened
invariant. Many infinite-state systems can only be proven with $k$-induction enriched with
invariant strengthening, whereas for finite systems the use of strengthening decreases the
minimal $k$ for which a $k$-induction proof succeeds.

Since our invariant strengthening procedure for $k$-induction heavily relies on
eliminating existentially quantified state variables, we develop an effective quantifier
elimination algorithm for this purpose. The main characteristic of this algorithm is that it
avoids a potential exponential blowup in the initial computation of a disjunctive normal
form whenever possible, and a constraint solver is used to identify relevant conjunctions.
In this way the paradigm of lazy theorem proving, as developed by the authors
for the ground case [7], is extended to first-order formulas.

The paper is organized as follows. Section 2 contains background material on
encodings of transition systems in terms of logic formulas. In Section 3 we develop the
notions of reverse and direct simulations together with an anytime algorithm for
computing these relations. Reverse and direct simulations are used in Section 4 to state a
generic $k$-induction principle and to provide sufficient conditions for the completeness
of these inductions. Sections 5 and 6 discuss invariant strengthening and lazy quantifier
elimination. Experimental results with $k$-induction and invariant strengthening for
various infinite-state protocols, timed automata, and linear hybrid systems are summarized
in Section 7. Comparisons to related work are in Section 8.




2  Background 


Let $V := \{x_1, \ldots, x_n\}$ be a set of variables interpreted over nonempty domains
$D_1$ through $D_n$, together with a type assignment $\tau$ such that $\tau(x_i) = D_i$. For
a set of typed variables $V$, a variable assignment is a function $\nu$ from variables
$x \in V$ to an element of $\tau(x)$. The variables in $V := \{x_1, \ldots, x_n\}$ are also
called state variables, and a program state is a variable assignment over $V$.

All the developments in this paper are parametric with respect to a given constraint
theory $C$, such as linear arithmetic or a theory of bitvectors. We assume a computable
function for deciding satisfiability of a conjunction of constraints in $C$. A set of Boolean
constraints, $\mathit{Bool}(C)$, includes all constraints in $C$ and is closed under
conjunction $\wedge$, disjunction $\vee$, and negation $\neg$. Effective solvers for deciding
the satisfiability problem in $\mathit{Bool}(C)$ have been previously described [7, 6].

A tuple $\langle V, I, T \rangle$ is a $C$-program over $V$, where interpretations of the
typed variables $V$ describe the set of states, $I \in \mathit{Bool}(C(V))$ is a predicate
that describes the initial states, and $T \in \mathit{Bool}(C(V \cup V'))$ specifies the
transition relation between current states and their successor states ($V$ denotes the
current state variables, while $V'$ stands for the next state variables). The semantics of a
program is given in terms of a transition system $M$ in the usual way.

For a program $M = \langle V, I, T \rangle$, a sequence of states $\pi(s_0, s_1, \ldots, s_n)$
forms a path through $M$ if $\bigwedge_{0 \le i < n} T(s_i, s_{i+1})$. A state $s$ is reachable
in $M$ if there is a path $\pi(s_0, s_1, \ldots, s_{n-1}, s)$ through $M$ and $I(s_0)$, and a
state property $\varphi \in C(V)$ is invariant in $M$ iff $\varphi(s)$ holds for every
reachable state $s$ in $M$. A counterexample for a property $\varphi$ is a path
$\pi(s_0, \ldots, s_n)$ such that $I(s_0)$ and $\neg\varphi(s_n)$, and the length
$\mathit{len}(\pi)$ of such a counterexample is given by the number of states in this path.
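On a finite, explicitly enumerated state space these definitions can be exercised directly. The following Python sketch is illustrative only: the name `shortest_counterexample` and the encoding of $I$ as a list, $T$ as a set of pairs, and $\varphi$ as a Python predicate are assumptions of this example, not part of the paper's symbolic formalism.

```python
from collections import deque

def shortest_counterexample(init, trans, prop):
    """Breadth-first search for a path s_0, ..., s_n with I(s_0) and not prop(s_n).

    init  : list of initial states (hypothetical explicit encoding of I)
    trans : set of pairs (s, t) meaning T(s, t)
    prop  : predicate on states
    Returns the path as a list of states, or None if prop is invariant.
    """
    parent = {s: None for s in init}
    queue = deque(init)
    while queue:
        s = queue.popleft()
        if not prop(s):
            path = []
            while s is not None:        # walk back to an initial state
                path.append(s)
                s = parent[s]
            return path[::-1]
        for (u, t) in trans:
            if u == s and t not in parent:
                parent[t] = s
                queue.append(t)
    return None

# Toy system: states 3 and 4 are unreachable from the initial state 0.
INIT = [0]
TRANS = {(0, 1), (1, 2), (2, 0), (3, 4)}
```

With this toy system, `s != 4` is invariant, while `s != 2` has a counterexample whose length (number of states) is 3.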

Typical programming constructs can be rewritten into the program syntax presented
above. For example, Dijkstra's guarded commands are encoded in terms of a disjunction
of conjunctions of guards $g(x_1, \ldots, x_n)$ and updates $x_i' = f_i(x_1, \ldots, x_n)$
for all variables $x_i$. Programs with external, non-deterministic inputs are defined by
partitioning the set of variables into input variables, which are unconstrained, and the
other state variables, whose next-state values are constrained by the transition relation.

Throughout this paper we use timed automata [2], which are state-transition graphs
augmented with a finite set of real-valued clocks, as a prototypical class of infinite-state
systems. Decidability of the model-checking problem for timed automata rests on the
fact that the space of clock valuations is partitioned into finitely many clock regions.
Two clock valuations $\nu_1$, $\nu_2$ that belong to the same region are (region) equivalent,
denoted as $\nu_1 \sim_{TA} \nu_2$. This region equivalence is a stable quotient relation,
that is, whenever $q \sim_{TA} u$ and $T(q, q')$, there exists a state $u'$ such that
$T(u, u')$ and $q' \sim_{TA} u'$ [2]. Encodings of timed automata in terms of logical programs
with linear arithmetic constraints are described in [19]. In particular, program states
consist of a location and nonnegative real interpretations of clocks. For timed automata we
restrict ourselves to proving so-called clock constraints $\varphi$, such that
$q \sim_{TA} u$ implies that $\varphi(q)$ iff $\varphi(u)$.

3 Direct and Reverse Simulation


The  notions  of  direct  and  reverse  simulation  as  developed  here  lay  out  the  foundation 
for  the  completeness  results  in  Section  4. 


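Definition 1 (Direct / Reverse Simulation). Let $M = \langle V, I, T \rangle$ be a program
and $\varphi$ a state formula over $V$. We define the functors $F_d$ and $F_r$ that map binary
relations $R$ over $V$ in the following way.

$$F_d(R)(s_1, s_2) := \begin{cases} \neg\varphi(s_2) & \text{if } \neg\varphi(s_1) \\ \forall s_1'.\ T(s_1, s_1') \Rightarrow \exists s_2'.\ R(s_1', s_2') \wedge T(s_2, s_2') & \text{otherwise} \end{cases}$$

$$F_r(R)(s_1, s_2) := \begin{cases} I(s_2) & \text{if } I(s_1) \\ \forall s_1'.\ T(s_1', s_1) \Rightarrow \exists s_2'.\ R(s_1', s_2') \wedge T(s_2', s_2) & \text{otherwise} \end{cases}$$

A direct simulation over $V$ with respect to $\varphi$ is any binary relation $\prec$ over $V$
that satisfies $\prec\ \subseteq F_d(\prec)$. Similarly, a reverse simulation over $V$ with
respect to $\varphi$ is any binary relation $\prec$ over $V$ that satisfies
$\prec\ \subseteq F_r(\prec)$.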


In contrast to reverse simulations, direct simulations depend on a state formula $\varphi$.
Also, the definition of direct simulation is inspired by the notion of stable relations above.
Direct (reverse) simulations are usually denoted by $\prec_d$ ($\prec_r$). The following
direct and reverse simulations are used as running examples throughout the paper.


Example 1. The empty relation $\prec_\emptyset\ := \mathit{false}$ is a direct and a reverse
simulation.

Example 2. Equality ($=$) between states is a direct and a reverse simulation.

Example 3. The relation $s_1 \prec_I s_2 := I(s_1) \wedge I(s_2)$ is a reverse simulation,
where $I$ is the predicate for describing the set of initial states of the given program.

Example 4. Now, consider programs $\langle V, I, T \rangle$ with inputs such that
$\mathit{input}(x)$ holds iff $x$ is an input variable. The relation

$$s_1 =_i s_2 := \text{for all variables } x \in V.\ \mathit{input}(x) \text{ or } s_1(x) = s_2(x),$$

with $s(x)$ denoting the value of the variable $x$ in the state $s$, is a reverse simulation,
since the values of the input variables are not constrained by the predicate $I$ and their
next values are not constrained by $T$. Obviously, for transition systems with inputs, the
relation $s_1 =_i s_2$ is weaker than $=$, and therefore gives rise to shorter paths.

Example 5. We now consider timed automata programs and clock constraints. The
region equivalence $\sim_{TA}$, which gives rise to finitely many clock regions, is stable,
and therefore a direct simulation.
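For intuition, the direct functor of Definition 1 can be evaluated by brute force on a small explicit-state system. The sketch below is a hypothetical illustration: the names `f_d` and `is_direct_simulation`, the three-state system, and the set-of-pairs encoding of relations are assumptions made for this example only.

```python
def f_d(rel, states, trans, prop):
    """Return F_d(rel) as a set of pairs (s1, s2), following Definition 1."""
    def succ(s):
        return {t for (u, t) in trans if u == s}

    result = set()
    for s1 in states:
        for s2 in states:
            if not prop(s1):
                ok = not prop(s2)
            else:
                # every successor of s1 must be matched by a rel-related successor of s2
                ok = all(any((t1, t2) in rel for t2 in succ(s2)) for t1 in succ(s1))
            if ok:
                result.add((s1, s2))
    return result

def is_direct_simulation(rel, states, trans, prop):
    """A relation is a direct simulation iff it is contained in F_d of itself."""
    return set(rel) <= f_d(set(rel), states, trans, prop)

# Hypothetical toy system; the property holds everywhere except in state 2.
STATES = {0, 1, 2}
TRANS = {(0, 1), (1, 2), (2, 1)}
PROP = lambda s: s != 2
EMPTY = set()                          # Example 1
EQUALITY = {(s, s) for s in STATES}    # Example 2
```

On this system the empty relation and equality pass the check, whereas the relation containing only the pair `(0, 1)` does not, because the successor of state 0 has no related successor of state 1.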


The notions of direct and reverse simulation are modular in the sense that the union
of direct (reverse) simulations is also a direct (reverse) simulation.

Proposition 1 (Modularity). If $\prec_1$ and $\prec_2$ are direct (reverse) simulations,
then $\prec_1 \cup \prec_2$ is also a direct (reverse) simulation.




This property follows directly from the definitions of direct (reverse) simulations in
Definition 1 and from the monotonicity of the functors $F_d$ and $F_r$. For example, the
reverse simulations $\prec_I$ and $=_i$ in Examples 3 and 4 may be combined to obtain a new
reverse simulation.

Given a program $M = \langle V, I, T \rangle$ and a property $\varphi$, the associated largest
direct (reverse) simulation relation $\prec_d$ ($\prec_r$) is obtained as the greatest fixpoint
of the functor $F_d$ ($F_r$) in Definition 1. These fixpoints exist, since $F_d$ and $F_r$ are
monotonic. However, the fixpoint iterations are often prohibitively expensive, and a direct
(reverse) simulation is only obtained on convergence of the iteration. The iteration in
Proposition 2 provides a viable alternative in that a reverse (direct) simulation is refined
to obtain a stronger reverse (direct) simulation. The proof of the proposition below follows
from the definitions of reverse (direct) simulations, from the monotonicity of the functors
$F_r$ ($F_d$), and from modularity (Proposition 1).

Proposition 2 (Anytime Iteration). If $\prec_r$ ($\prec_d$) is a reverse (direct) simulation,
then for all $n \ge 0$ the relation $\prec_{r,n}$ ($\prec_{d,n}$) is also a reverse (direct)
simulation:

$$\prec_{r,0}\ :=\ \prec_r \qquad\qquad \prec_{d,0}\ :=\ \prec_d$$

$$\prec_{r,n}\ :=\ \prec_{r,n-1} \cup F_r(\prec_{r,n-1}) \qquad\qquad \prec_{d,n}\ :=\ \prec_{d,n-1} \cup F_d(\prec_{d,n-1})$$

Consequently, this iteration gives rise to an anytime algorithm for computing direct
(reverse) simulations, and equality $=$, for example, may be used as seed, since it is
both a direct and a reverse simulation (see Example 2). Also, quantifier elimination
algorithms such as the one in Section 6 may be used in this iteration.
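The anytime iteration can be sketched for reverse simulations on an explicit finite system. This is an illustrative assumption-laden example: `f_r`, `anytime`, and the diamond-shaped system are hypothetical names and data, and relations are represented as plain sets of state pairs rather than formulas.

```python
def f_r(rel, states, init, trans):
    """Return F_r(rel) as a set of pairs (s1, s2), following Definition 1."""
    def pred(s):
        return {u for (u, t) in trans if t == s}

    result = set()
    for s1 in states:
        for s2 in states:
            if s1 in init:
                ok = s2 in init
            else:
                # every predecessor of s1 must be matched by a rel-related predecessor of s2
                ok = all(any((p1, p2) in rel for p2 in pred(s2)) for p1 in pred(s1))
            if ok:
                result.add((s1, s2))
    return result

def anytime(seed, rounds, states, init, trans):
    """Iterate rel := rel ∪ F_r(rel); each intermediate result can be used as is."""
    rel = set(seed)
    for _ in range(rounds):
        rel |= f_r(rel, states, init, trans)
    return rel

# Hypothetical diamond: 0 -> 1 -> 3 and 0 -> 2 -> 3, with 0 initial.
STATES = {0, 1, 2, 3}
INIT = {0}
TRANS = {(0, 1), (0, 2), (1, 3), (2, 3)}
EQUALITY = {(s, s) for s in STATES}
```

Seeded with equality, one round already relates the two symmetric middle states 1 and 2; a second round adds nothing, so the iteration has converged on this example.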

4 Completeness of k-Induction

Given the notions of direct and reverse simulations, we develop sufficient conditions
for proving completeness of $k$-induction. These results are based on restricting paths to
not contain states that are similar with respect to a given direct or reverse simulation.
For direct (reverse) simulations we define a compressed path w.r.t. the given direct
(reverse) simulation as a path $\pi(s_0, s_1, \ldots, s_n)$ not containing any $s_i$, $s_j$
with $j < i$ ($i < j$) such that $s_i$ directly (reversely) simulates $s_j$.

Definition 2 (Path Compression).

- A path $\pi^{\prec_d}(s_0, s_1, \ldots, s_n)$ is compressed w.r.t. the direct simulation
  $\prec_d$ if:

  $$\pi^{\prec_d}(s_0, s_1, \ldots, s_n) := \pi(s_0, s_1, \ldots, s_n) \wedge \bigwedge_{0 \le j < i \le n} \neg(s_i \prec_d s_j).$$

- A path $\pi^{\prec_r}(s_0, s_1, \ldots, s_n)$ is compressed w.r.t. the reverse simulation
  $\prec_r$ if:

  $$\pi^{\prec_r}(s_0, s_1, \ldots, s_n) := \pi(s_0, s_1, \ldots, s_n) \wedge \bigwedge_{0 \le i < j \le n} \neg(s_i \prec_r s_j).$$

A path that is compressed with respect to the reverse and the direct simulations $\prec_r$
and $\prec_d$ is denoted by $\pi^{\prec_{r,d}}$.

Fig. 1. Incompleteness of k-induction.

For example, a path $\pi(s_0, \ldots, s_n)$ is compressed w.r.t. the reverse simulation ($=$)
from Example 2 iff it is acyclic. Moreover, given the reverse simulation $\prec_I$ from
Example 3, a path $\pi(s_0, \ldots, s_n)$ is compressed w.r.t. $\prec_I$ iff it contains at
most one initial state. Obviously, for transition systems with inputs, the relation ($=_i$)
(see Example 4) is weaker than ($=$), and therefore gives rise to shorter compressed paths.
We have collected all ingredients for defining $k$-induction for arbitrarily compressed paths.
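The two compression conditions of Definition 2 differ only in the direction in which states are compared. A minimal Python sketch, assuming simulations are given as two-argument predicates (the function names and sample relations are hypothetical, and only the pairwise conditions are checked, not the path condition $\pi$ itself):

```python
def compressed_reverse(path, sim):
    """True if there are no positions i < j with sim(s_i, s_j)."""
    n = len(path)
    return not any(sim(path[i], path[j]) for i in range(n) for j in range(i + 1, n))

def compressed_direct(path, sim):
    """True if there are no positions j < i with sim(s_i, s_j)."""
    n = len(path)
    return not any(sim(path[i], path[j]) for i in range(n) for j in range(i))

INIT = {0}                                            # hypothetical initial states
equality = lambda a, b: a == b                        # Example 2
both_initial = lambda a, b: a in INIT and b in INIT   # Example 3
less_than = lambda a, b: a < b                        # asymmetric relation, for contrast only
```

With `equality` a sequence is compressed exactly when no state repeats, and with `both_initial` exactly when at most one initial state occurs. The asymmetric `less_than` relation (not a simulation from the paper) merely shows that the two conditions are not interchangeable.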

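Definition 3 (k-Induction). Let $M = \langle V, I, T \rangle$ be a program, $k$ an integer,
$\prec_r$ a reverse simulation, and $\prec_d$ a direct simulation. The induction scheme of
depth $k$, $\mathrm{IND}^{\prec_{r,d}}(k)$, allows one to deduce the invariance of $\varphi$
in $M$ if the following holds.

- $I(s_0) \wedge \pi^{\prec_{r,d}}(s_0, \ldots, s_{k-1}) \Rightarrow \varphi(s_0) \wedge \ldots \wedge \varphi(s_{k-1})$

- $\varphi(s_n) \wedge \ldots \wedge \varphi(s_{n+k-1}) \wedge \pi^{\prec_{r,d}}(s_n, \ldots, s_{n+k}) \Rightarrow \varphi(s_{n+k})$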

For example, given the empty relation $\prec_\emptyset$ from Example 1,
$\mathrm{IND}^{\prec_\emptyset}$ reduces to the naive, incomplete $k$-induction on arbitrary
paths. Consider, for example, the system in Figure 1 and a property $\varphi$, which is
assumed to hold only in $q_1$. Now, an execution sequence that repeats a state $k$ times is
not $k$-inductive, but it is ruled out under the acyclic path restriction. The complete
$k$-induction schemes in [17], which consider only acyclic paths and paths that only visit
initial states once, can be recovered by instantiating Definition 3 with the relations ($=$)
(Example 2) and ($\prec_I$) (Example 3), respectively. Since both ($=$) and ($\prec_I$) are
reverse simulations, an induction scheme restricted to acyclic paths visiting initial states
at most once is obtained by modularity (Proposition 1).
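The effect of the acyclic path restriction can be reproduced by brute-force enumeration on a small explicit system. The sketch below is not the system of Figure 1; the three-state system, the function names, and the decision to check the base case for every length up to $k$ (mirroring the practice of increasing $k$ from 1) are assumptions of this example. Only symmetric relations are used, so the direct and reverse compression conditions coincide.

```python
def paths(states, trans, n):
    """All sequences of n states whose consecutive states are related by trans."""
    result = [(s,) for s in sorted(states)]
    for _ in range(n - 1):
        result = [p + (t,) for p in result for t in sorted(states) if (p[-1], t) in trans]
    return result

def compressed(path, sim):
    """No two positions i < j with sim(s_i, s_j)."""
    n = len(path)
    return not any(sim(path[i], path[j]) for i in range(n) for j in range(i + 1, n))

def k_induction(states, init, trans, prop, k, sim):
    # Base case: prop holds on every compressed path of at most k states from an initial state.
    base = all(prop(s)
               for n in range(1, k + 1)
               for p in paths(states, trans, n)
               if p[0] in init and compressed(p, sim)
               for s in p)
    # Step: k consecutive prop-states on a compressed path force prop in the next state.
    step = all(prop(p[-1])
               for p in paths(states, trans, k + 1)
               if compressed(p, sim) and all(prop(s) for s in p[:-1]))
    return base and step

# Hypothetical system: 0 is initial with a self-loop; the unreachable state 1 loops and leads to bad state 2.
STATES, INIT = {0, 1, 2}, {0}
TRANS = {(0, 0), (1, 1), (1, 2)}
PROP = lambda s: s != 2
empty = lambda a, b: False
equality = lambda a, b: a == b
```

On this system the property is invariant, yet with the empty relation the looping sequence 1, ..., 1, 2 defeats the induction step for every $k$; with equality (acyclic paths) the proof succeeds at $k = 2$.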

Completeness of $k$-induction relies heavily on the notion of path compression. We
now state the main lemma.

Lemma 1 (Compressing non-$\pi^{\prec_{r,d}}$ paths). Let $\pi(s_0, \ldots, s_n)$ be a given
path; then:

1. There exists a $\prec_r$-compressed path $\pi^{\prec_r}(q_0, \ldots, q_m)$, s.t.
   $q_m = s_n$ and $m \le n$.

2. There exists a $\prec_d$-compressed path $\pi^{\prec_d}(q_0, \ldots, q_m)$, s.t.
   $q_0 = s_0$ and $m \le n$.

Proof sketch. Assume a path $\pi(s_0, \ldots, s_n)$, which is not compressed w.r.t.
$\prec_r$. By Definition 1 it follows that there are states $s_i, s_j \in \pi(s_0, \ldots, s_n)$
such that $s_i \prec_r s_j$ and $i < j$. We distinguish two cases. First, if $s_i$ is an
initial state, then so is $s_j$, and therefore a shorter path $\pi(s_j, \ldots, s_n)$ is
obtained as a counterexample. Second, if $s_i$ is not an initial state, then $s_i \ne s_0$
and there exists an $s_{i-1}$ such that $T(s_{i-1}, s_i)$. Since $s_i \prec_r s_j$ it follows
by Definition 1 that there is a state $s'_{i-1}$ such that $s_{i-1} \prec_r s'_{i-1}$ and
$T(s'_{i-1}, s_j)$. If $s_{i-1}$ is an initial state, then so is $s'_{i-1}$, and since
$i < j$ a shorter path $\pi^{\prec_r}(s'_{i-1}, s_j, \ldots, s_n)$ is obtained. If $s_{i-1}$
is not initial, by repeating the above argument a shorter path is constructed. In both cases
a shorter path is obtained; if such a path is not a compressed path, then it is further
reduced. The proof for $\prec_d$-compressed paths works analogously.

$\mathrm{IND}^{\prec_{r,d}}(k)$ is complete if: $\varphi$ is an invariant of $M$ iff there is
a $k$ such that $\mathrm{IND}^{\prec_{r,d}}(k)(\varphi)$. Now, completeness of $k$-induction
follows from the main Lemma 1 above.

Theorem 1 (Completeness). $\mathrm{IND}^{\prec_{r,d}}(k)$ is a complete proof method iff
there is an upper bound on the length of the paths $\pi^{\prec_{r,d}}(s_0, \ldots, s_n)$.

Using the simulation from Example 2, Theorem 1 is instantiated to obtain the following
complete $k$-induction for finite-state systems.

Corollary 1. Let $M$ be a finite-state program over $V$ and $\varphi$ a state property in $V$;
then $\mathrm{IND}^{=}(k)$ induction is complete.

In general, $k$-induction for ($=$) is not complete for infinite-state systems. Consider, for
example, the program $M = \langle I, T \rangle$ over the integer state variable $x$ with
$I = (x = 0)$ and $T = (x' = x + 2)$, and the formula $x \ne 3$. Obviously, it is the case
that $x \ne 3$ is invariant in $M$, but there exists no $k \in \mathbb{N}$ such that the
property is proven by $\mathrm{IND}^{=}(k)$. However, $k$-induction is complete for timed
automata, since the equivalence relation $\sim_{TA}$ is a direct simulation (Example 5), and
an upper bound on the length of the paths $\pi^{\sim_{TA}}(s_0, \ldots, s_n)$ is given by the
number of clock regions.
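The incompleteness claim for $x' = x + 2$ and $x \ne 3$ can be made concrete: for every $k$ there is an acyclic sequence of $k$ states satisfying $x \ne 3$ that steps into 3. The Python sketch below constructs and checks such a sequence; the function names are hypothetical and the code only illustrates the argument, it is not part of the paper's tooling.

```python
def step_counterexample(k):
    """Return k + 1 integers x_n, ..., x_{n+k} that follow x' = x + 2 and end in 3."""
    return [3 - 2 * (k - i) for i in range(k + 1)]

def violates_step(path):
    """True if path follows T, is acyclic, satisfies x != 3 on all but its last state, and ends in 3."""
    follows_t = all(b == a + 2 for a, b in zip(path, path[1:]))
    acyclic = len(set(path)) == len(path)
    return follows_t and acyclic and all(x != 3 for x in path[:-1]) and path[-1] == 3
```

For $k = 2$ the sequence is $-1, 1, 3$. All of its states are odd and hence unreachable from $x = 0$, so the counterexample to the induction step is spurious, yet equality-based compression cannot exclude it.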

Corollary 2. Let $M$ be a timed automata program over the clock evaluations $C$ and $\varphi$
a clock constraint in $C$; then $\mathrm{IND}^{\sim_{TA}}(k)$ induction is complete.

Similar  results  are  obtained  for  other  direct  and  reverse  simulations  and  combinations 
thereof. 


5  Invariant  Strengthening 

Whenever $k$-induction fails to prove a property $\varphi$, there is a counterexample
$\pi = s_n, s_{n+1}, \ldots, s_{n+k}$ such that the first $k$ states satisfy $\varphi$
whereas the last state does not satisfy this property. If $s_n$ is indeed reachable, then
$\varphi$ is not invariant. Otherwise, the counterexample is labeled as spurious and it is
inconclusive whether $\varphi$ is invariant or not. However, by assuming $s_n$ to be
unreachable, such a spurious counterexample is used to obtain a strengthened invariant
$\varphi \wedge \neg(s_n)$.

Consider, for example, the property $\neg(q_4)$ for the system in Figure 1. Induction
of depth $k = 1$ fails, and the counterexample $q_3 \to q_4$ is obtained. Now, $\neg(q_4)$ is
strengthened to obtain $\neg(q_4) \wedge \neg(q_3)$, which is proven using 1-induction. More
generally, whenever the induction step of $\mathrm{IND}^{\prec_{r,d}}(k)$ fails, the formula

$$Q(s_n, \ldots, s_{n+k}) := \varphi(s_n) \wedge \ldots \wedge \varphi(s_{n+k-1}) \wedge \pi^{\prec_{r,d}}(s_n, \ldots, s_{n+k}) \wedge \neg\varphi(s_{n+k})$$

is satisfiable, and each satisfying assignment describes a counterexample for the induction
step. Thus, we define the predicate $U(s)$ for representing the set of possibly unreachable
states, which may reach the bad state in $k$ steps by means of a $\pi^{\prec_{r,d}}$ path,

$$U(s) := \exists s_{n+1}, \ldots, s_{n+k}.\ Q(s, s_{n+1}, \ldots, s_{n+k}).$$

Now, $\varphi$ is strengthened as $\varphi \wedge \neg U(s)$, and quantifier elimination is
used for transforming this strengthened formula into an equivalent Boolean constraint formula.
For the general case, we use the quantifier elimination procedure in Section 6. Notice,
however, that for special cases such as guarded command languages, the quantifiers in $U(s)$
are eliminated using purely syntactic operations such as substitution, since all
quantifications are over "next-state" variables $x'$ for which there are explicit solutions
$f(.)$. An example might help to illustrate the combination of $k$-induction, strengthening,
and quantifier elimination.

Fig. 2. Bakery Mutual Exclusion Protocol.

Example 6. Consider the usual stripped-down version of Lamport's Bakery protocol in
Figure 2 with the initial value 0 for both counters $y1$ and $y2$ and the mutual exclusion
property $MX$ defined by $\neg(pc1 = a3 \wedge pc2 = b3)$. We apply 3-induction with the
empty simulation relation $\prec_\emptyset$. The base step holds and the induction step fails,
thus we obtain

$$U(s_n) := \exists s_{n+1}, s_{n+2}, s_{n+3}.\ MX(s_n) \wedge MX(s_{n+1}) \wedge MX(s_{n+2}) \wedge \pi^{\prec_\emptyset}(s_n, s_{n+1}, s_{n+2}, s_{n+3}) \wedge \neg MX(s_{n+3})$$

with states $s_i$ of the form $(pc1_i, y1_i, pc2_i, y2_i)$. Since the transitions of the Bakery
protocol are in terms of guarded commands, simple substitution is used to obtain a
quantifier-eliminated form, $R(s)$, defined as

$$R(s) := (pc1 = a1 \wedge pc2 = b2 \wedge y2 = 0) \vee (pc1 = a2 \wedge pc2 = b1 \wedge y1 = 0).$$

Now, the strengthened property $MX(s) \wedge \neg R(s)$ is proven using 3-induction.
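The interplay of failed induction steps and strengthening can be sketched on explicit finite systems, where the set $U$ is computed by a backward pass instead of quantifier elimination. Everything here is an assumption made for illustration: the names `possibly_unreachable` and `prove`, the use of the empty simulation relation, and the small chain-shaped system, which is not the Bakery protocol.

```python
def possibly_unreachable(states, trans, good, k):
    """States in `good` that reach a non-good state in exactly k steps via good states (the set U)."""
    layer = set(states) - good
    for _ in range(k):
        layer = {s for (s, t) in trans if t in layer and s in good}
    return layer

def prove(states, init, trans, good, k, max_rounds):
    """k-induction on arbitrary paths, strengthening the property after each failed step."""
    good = set(good)
    for rounds in range(max_rounds + 1):
        # Base case: the first k states of every execution must satisfy the current property.
        frontier = set(init)
        for _ in range(k):
            if not frontier <= good:
                return ("refuted", rounds)
            frontier = {t for (s, t) in trans if s in frontier}
        # Induction step: it holds iff no state starts a step counterexample.
        bad_starts = possibly_unreachable(states, trans, good, k)
        if not bad_starts:
            return ("proved", rounds)
        good -= bad_starts            # strengthen: property := property and not U
    return ("unknown", max_rounds)

# Hypothetical system: 0 is initial; the chain 1 -> 2 -> 3 ends in the bad state 3.
STATES, INIT = {0, 1, 2, 3}, {0}
TRANS = {(0, 0), (1, 2), (2, 3)}
GOOD = {0, 1, 2}
```

With $k = 1$ the chain is removed from the property in two rounds and the proof then succeeds. If the edge `(0, 1)` is added, the states assumed unreachable turn out to be reachable, and the base case eventually fails.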

6 Quantifier Elimination

Given a quantified formula $\exists \mathit{vars}.\ \varphi$ with
$\varphi \in \mathit{Bool}(C)$, quantifier-elimination procedures usually work by
transforming $\varphi$ into disjunctive normal form (DNF) and distributing the existential
quantifiers over disjunctions. Thus, one is left with eliminating quantifiers from a set of
existentially quantified conjunctions of literals. We assume as given such a procedure C-qe.
The main drawback of these procedures is that there is a potential exponential blowup in the
initial transformation to DNF and C-qe might even return further disjunctions (as is the case
for Presburger arithmetic); this problem has been addressed for the Boolean case by
McMillan [14].

The quantifier elimination problem for invariant strengthening, as discussed in Section 5,
allows for a purely syntactic quantifier elimination as long as we are restricting
ourselves to guarded command programs. In these cases, C-qe just applies the substitution
rule ($x \notin \mathit{vars}(\psi)$)

$$(\exists x.\ (x = \psi) \wedge \varphi(x)) \quad\text{iff}\quad \varphi(\psi)$$

possibly followed by simplification. Quantifier elimination by substitution has already
been used in the context of model checking, for example, by Coudert, Berthet, and
Madre [15] and more recently by Williams, Biere, Clarke, Gupta [20], and Abdulla,
Bjesse, Eén [1]. Another C-qe function is used in McMillan's [14] quantifier elimination
algorithm based on propositional SAT solving, in that his C-qe(vars, c) simply deletes
the literals in $c$ that contain a variable in vars. In contrast, depending on the
background theory, arbitrarily complex quantifier elimination procedures, such as the ones
for Presburger arithmetic or real-closed fields, can also be used here.

    procedure qe(vars, $\varphi$)
      $\psi$ := false
      loop
        c := next-solution($\varphi$)
        if c = false then return $\psi$
        c' := C-qe(vars, c)
        $\psi$ := $\psi \vee$ c'
        $\varphi$ := $\varphi \wedge \neg$c'

Fig. 3. Lazy Quantifier Elimination.

As motivated above, the initial DNF computation should usually be avoided when
possible. Given a set of existentially quantified variables vars and a quantifier-free
formula $\varphi$ in $\mathit{Bool}(C)$, the algorithm qe(vars, $\varphi$) in Figure 3 returns
a formula in $\mathit{Bool}(C)$ which is equivalent to $\exists \mathit{vars}.\ \varphi$. The
procedure qe relies on a satisfiability solver for formulas $\varphi \in \mathit{Bool}(C)$,
which is assumed to enumerate representations of sets of satisfiable models in terms of
conjunctions of literals in $\varphi$. Such a solver is described, for example, in [7, 6].
These solutions are supposed to be enumerated by successive calls to next-solution in
Figure 3. Since there are only a finite number of solutions in terms of subsets of literals,
the function qe is terminating. Moreover, minimal solutions or good over-approximations
thereof, as produced by the lazy theorem proving algorithm [7, 6], accelerate convergence.

The variable $c$ in Figure 3 stores the current solution obtained by next-solution, and
the procedure C-qe applies quantifier elimination for conjunctions. In many cases, C-qe
just applies the substitution rule to remove quantified variables. In order to obtain the
next set of solutions, we rule out the current solutions by updating $\varphi$ with the value
$\neg c'$ instead of $\neg c$, since $\neg c'$ is more restrictive.

Thus, the quantifier elimination procedure in Figure 3 avoids eager computation of
a disjunctive normal form. Moreover, a solver for $\mathit{Bool}(C)$ is used to guide the
search for relevant "conjunctions" in $\varphi$. In this way, the qe algorithm extends the
lazy theorem proving paradigm described in [7, 6] to the case of first-order reasoning.
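The loop of Figure 3 can be illustrated for the purely propositional case, where C-qe deletes the literals over quantified variables, as in the McMillan-style variant mentioned above. This Python sketch makes strong simplifying assumptions: `next_solution` is a brute-force enumeration that returns full assignments rather than minimal conjunctions, formulas are Python predicates over dictionaries, and all names are hypothetical. It shows the control structure only, not the performance benefit of a real lazy solver.

```python
from itertools import product

def lazy_qe(variables, quantified, phi):
    """Return a list of cubes (dicts over the free variables) whose disjunction
    is equivalent to 'exists quantified . phi'."""
    free = [v for v in variables if v not in quantified]
    psi = []                                   # psi := false

    def blocked(assignment):
        # phi := phi and not c'  is realised by skipping models already covered by psi
        return any(all(assignment[v] == val for v, val in cube.items()) for cube in psi)

    def next_solution():
        for values in product([False, True], repeat=len(variables)):
            assignment = dict(zip(variables, values))
            if phi(assignment) and not blocked(assignment):
                return assignment
        return None

    while True:
        c = next_solution()
        if c is None:
            return psi
        psi.append({v: c[v] for v in free})    # C-qe: drop literals over quantified variables

def holds(cubes, assignment):
    """Evaluate the disjunction of cubes under an assignment of the free variables."""
    return any(all(assignment[v] == val for v, val in cube.items()) for cube in cubes)

VARS = ["x", "y", "z"]
PHI = lambda a: (a["x"] and a["z"]) or (a["y"] and not a["z"])
```

For `PHI`, eliminating `z` yields cubes equivalent to `x or y`. Blocking the projected cube $c'$ rather than $c$ is what guarantees that each iteration removes every model sharing the same free-variable part.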

Example 7. Consider

$$\exists x_1, y_1.\ \big(((x_0 = 1 \vee x_0 = 3 \vee y_0 > 1) \wedge x_1 = x_0 - 1 \wedge y_1 = y_0 + 1) \vee ((x_0 = -1 \vee x_0 = -3) \wedge x_1 = x_0 + 2 \wedge y_1 = y_0 - 1)\big) \wedge x_1 < 0$$

A first satisfiable conjunction of literals is obtained by, say

$$c := y_0 > 1 \wedge x_1 = x_0 - 1 \wedge y_1 = y_0 + 1 \wedge x_1 < 0.$$

Now, application of the substitution rule yields $c' := y_0 > 1 \wedge x_0 - 1 < 0$, and,
after updating $\varphi$ with $\neg c'$, a second solution is obtained as

$$c := x_0 = -3 \wedge x_1 = x_0 + 2 \wedge y_1 = y_0 - 1 \wedge x_1 < 0.$$

Again, applying the substitution rule, one gets $c' := x_0 = -3 \wedge x_0 + 2 < 0$, and,
since there are no further solutions, the quantifier-eliminated formula is
$(y_0 > 1 \wedge x_0 - 1 < 0) \vee (x_0 = -3 \wedge x_0 + 2 < 0)$.
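The result of Example 7 can be sanity-checked by brute force. The sketch below assumes integer-valued variables restricted to a small grid, which is an assumption of this check rather than of the example; the function names are hypothetical. Because $x_1$ and $y_1$ are determined by $x_0$ and $y_0$, a witness range of $[-10, 10]$ suffices for inputs in $[-5, 5]$.

```python
def original(x0, y0, x1, y1):
    """The quantifier-free body of the formula in Example 7."""
    first = (x0 == 1 or x0 == 3 or y0 > 1) and x1 == x0 - 1 and y1 == y0 + 1
    second = (x0 == -1 or x0 == -3) and x1 == x0 + 2 and y1 == y0 - 1
    return (first or second) and x1 < 0

def eliminated(x0, y0):
    """The quantifier-eliminated formula obtained at the end of Example 7."""
    return (y0 > 1 and x0 - 1 < 0) or (x0 == -3 and x0 + 2 < 0)

def exists_witness(x0, y0, bound=10):
    """Search for x1, y1 in [-bound, bound] that satisfy the original body."""
    rng = range(-bound, bound + 1)
    return any(original(x0, y0, x1, y1) for x1 in rng for y1 in rng)

def agree_on_grid(limit=5):
    """Compare both formulas for all x0, y0 in [-limit, limit]."""
    grid = range(-limit, limit + 1)
    return all(exists_witness(x0, y0) == eliminated(x0, y0) for x0 in grid for y0 in grid)
```

Agreement on the grid is evidence for, not a proof of, the equivalence; the argument in the example itself rests on the substitution rule.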

7  Experiments 


We describe some of our experiments with $k$-induction and invariant strengthening.
Our benchmark examples include infinite-state systems such as communication protocols,
timed automata and linear hybrid systems.¹ In particular, Table 1 contains experimental
results for the Bakery protocol as described earlier, Simpson's protocol [18] to
avoid interference between concurrent reads and writes in a fully asynchronous system,
well-known timed automata benchmarks such as the train gate controller and Fischer's
mutual exclusion protocol, and three linear hybrid automata benchmarks for water level
monitoring, the leaking gas burner, and the multi-rate Fischer protocol. Timed automata
and linear hybrid systems are encoded as in [19]. Starting with $k = 1$ we increase $k$
until $k$-induction succeeds. We are using invariant strengthening only in cases where
syntactic quantifier elimination based on substitution suffices. In particular, we do not
use strengthening for the timed and hybrid automata examples; that is, C-qe only tries to
apply the substitution rule. The resulting satisfiability problems for Boolean combinations
of linear arithmetic constraints are solved using the lazy theorem proving algorithm
described in [7] and implemented in the ICS decision procedures [9].


System Name              Proved with k    Time    Refinements
Bakery Protocol          3                        1
Simpson Protocol         2                        2
Train Gate Controller    5                        0
Fischer Protocol         4                        0
Water Level Monitor      1                        0
Leaking Gas Burner       6                        0
Multi Rate Fischer       4                        0

Table 1. Results for k-induction. Timings are in seconds.


¹ These benchmarks are available at http://www.csl.sri.com/~demoura/cav03examples

The experimental results in Table 1 are obtained on a 2 GHz Pentium-IV with 1 GB
of memory. The second column in Table 1 lists the minimal $k$ for which $k$-induction
succeeds, the third column includes the total time (in seconds) needed for all inductions
from 0 to $k$, and the fourth column the number of strengthenings. Timings do not include
the time for quantifier elimination, since we restricted ourselves to syntactic quantifier
elimination only. Notice that invariant strengthening is essential for the proofs of the
Bakery protocol and Simpson's protocol, since $k$-induction alone does not succeed for
any $k$.

Simpson's protocol for avoiding interference between concurrent reads and writes
in a fully asynchronous system has also been studied using traditional model checking
techniques. Using an explicit-state model checker, Rushby [16] demonstrates correctness
of a finitary version of this potentially infinite-state problem. Whereas it took
around 100 seconds for the model checker to verify this stripped-down problem,
$k$-induction together with invariant strengthening proves the general problem in a fraction
of a second. Moreover, other nontrivial problems such as correctness of the Illinois and
Futurebus cache coherence protocols, as given by [8], are easily established using
1-induction with only one round of strengthening.


8  Related  Work 


We restrict this comparison to the work we think is most closely related to ours. Sheeran,
Singh, and Stålmarck [17] also use k-induction, but their approach is restricted to
finite-state systems. They consider k-induction restricted to acyclic paths, where each
path is constrained to contain at most one initial state. These inductions are simple
instances of our general induction scheme based on reverse and direct simulations.
Moreover, invariant strengthening is used here to decrease the minimal k for which
k-induction succeeds.

Our path compression techniques can also be used to compute tight completeness
thresholds for BMC. For example, a compressed recurrence diameter is defined as the
smallest $n$ such that $I(s_0) \wedge \pi_n^{r,d}(s_0, \ldots, s_n)$ is unsatisfiable. Using
equality (=) for the simulation relation, this formula is equivalent to the recurrence
diameter in [4]. A tighter bound on the recurrence diameter, where values of input
variables are ignored, is obtained by using the reverse simulation =t. In this way, the
results in [12] are obtained as specific instances in our general framework based on
reverse and direct simulations. In addition, the compressed diameter is defined as the
smallest $n$ such that

\[
I(s_0) \wedge \pi_n^{r,d}(s_0, \ldots, s_n) \wedge \bigwedge_{i=0}^{n-1} \neg \pi_i^{r,d}(s_0, s_n)
\]

is unsatisfiable, where
$\pi_i^{r,d}(s_0, s_i) := \exists s_1, \ldots, s_{i-1}.\ \pi_i^{r,d}(s_0, s_1, \ldots, s_{i-1}, s_i)$
holds if there is a relevant path from $s_0$ to $s_i$ with $i$ steps. Depending on the
simulation relation, this compressed diameter yields tighter bounds for the completeness
thresholds than the ones usually used in BMC [4].
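
To make the two thresholds concrete, the following Python sketch computes both on a small explicit graph. It is a simplified illustration, not the symbolic computation described above: it assumes equality as the simulation relation (so a compressed path is simply a loop-free path), it enumerates paths explicitly, and the example graph and the function names `recurrence_threshold` and `diameter_threshold` are hypothetical.

```python
def loop_free_path_exists(trans, init, n):
    """True if some path with n transitions and no repeated state starts in init."""
    frontier = [(s,) for s in init]
    for _ in range(n):
        frontier = [p + (t,) for p in frontier
                    for t in trans.get(p[-1], ()) if t not in p]
    return bool(frontier)


def recurrence_threshold(trans, init):
    """Smallest n for which no loop-free path with n transitions leaves init."""
    n = 0
    while loop_free_path_exists(trans, init, n):
        n += 1
    return n


def diameter_threshold(trans, init):
    """Smallest n such that every state reachable in exactly n steps
    is already reachable in exactly i steps for some i < n."""
    seen = set()        # states reachable in fewer than n steps
    layer = set(init)   # states reachable in exactly n steps
    n = 0
    while layer - seen:
        seen |= layer
        layer = {t for s in layer for t in trans.get(s, ())}
        n += 1
    return n


# Hypothetical example: a long route 0 -> 1 -> 2 -> 3 and a shortcut 0 -> 3.
TRANS = {0: [1, 3], 1: [2], 2: [3], 3: [0]}
INIT = [0]

print(recurrence_threshold(TRANS, INIT))  # 4
print(diameter_threshold(TRANS, INIT))    # 3
```

The shortcut from 0 to 3 makes state 3 reachable in one step, so the diameter-based bound stops at 3, while the loop-free path 0, 1, 2, 3 forces the recurrence-based bound up to 4.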


9  Conclusion

We developed a general k-induction scheme based on the notions of reverse and direct
simulation, and we studied completeness of these inductions. Although any k-induction
proof can be reduced to a 1-induction proof with invariant strengthening, there are
certain advantages to using k-induction. In particular, bugs of length k are detected in
the initial step, and the number of strengthenings required to complete a proof is reduced
significantly. For example, a 1-induction proof of the Bakery protocol requires three
successive strengthenings, each of which produces 4 new clauses. There is, however,
a clear trade-off between the additional cost of using k-induction and the number of
strengthenings required in 1-induction, which needs to be studied further.